Found 4 Documents
Parameter-efficient fine-tuning of small language models for code generation: a comparative study of Gemma, Qwen 2.5 and Llama 3.2 Nguyen, Van-Viet; Nguyen, The-Vinh; Nguyen, Huu-Khanh; Vu, Duc-Quang
International Journal of Electrical and Computer Engineering (IJECE) Vol 16, No 1: February 2026
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijece.v16i1.pp278-287

Abstract

Large language models (LLMs) have demonstrated impressive capabilities in code generation; however, their high computational demands, privacy limitations, and challenges in edge deployment restrict their practical use in domain-specific applications. This study explores the effectiveness of parameter-efficient fine-tuning for small language models (SLMs) with fewer than 3 billion parameters. We adopt a hybrid approach that combines low-rank adaptation (LoRA) and 4-bit quantization (QLoRA) to reduce fine-tuning costs while preserving semantic consistency. Experiments on the CodeAlpaca-20k dataset reveal that SLMs fine-tuned with this method outperform larger baseline models, including Phi-3 Mini 4K base, on ROUGE-L. Notably, applying our approach to the LLaMA 3 3B and Qwen2.5 3B models yielded performance improvements of 54% and 55%, respectively, over untuned counterparts. We evaluate models developed by major artificial intelligence (AI) providers, namely Google (Gemma 2B), Meta (LLaMA 3 1B/3B), and Alibaba (Qwen2.5 1.5B/3B), and show that parameter-efficient fine-tuning enables them to serve as cost-effective, high-performing alternatives to larger LLMs. These findings highlight the potential of SLMs as scalable solutions for domain-specific software engineering tasks, supporting broader adoption and democratization of neural code synthesis.
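The core idea behind LoRA, which this abstract relies on, can be sketched without any deep-learning framework: a frozen weight matrix W is adapted by two small trainable factors B and A, so only a small fraction of parameters is trained. The dimensions and scaling below are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Minimal LoRA sketch: the adapted weight is W + (alpha / r) * B @ A, where
# W (d_out x d_in) stays frozen and only A (r x d_in) and B (d_out x r)
# are trained. Dimensions here are illustrative, not from the paper.
d_out, d_in, r, alpha = 512, 512, 8, 16

rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in))    # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, r))                   # starts at zero, so W' == W at init

def adapted_forward(x):
    """Forward pass through the LoRA-adapted layer."""
    return W @ x + (alpha / r) * (B @ (A @ x))

full_params = W.size
lora_params = A.size + B.size
print(f"trainable fraction: {lora_params / full_params:.4f}")
```

With rank r = 8 on a 512x512 layer, the trainable factors hold about 3% of the full matrix's parameters, which is the cost reduction that makes fine-tuning 1B-3B models feasible on modest hardware; QLoRA additionally stores W in 4-bit precision.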
Enhancing Autonomous GIS with DeepSeek-Coder: an open-source large language model approach Nguyen, Kim-Son; Nguyen, The-Vinh; Nguyen, Van-Viet; Thi, Minh-Hue Luong; Nguyen, Huu-Khanh; Nguyen, Duc-Binh
International Journal of Electrical and Computer Engineering (IJECE) Vol 16, No 1: February 2026
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijece.v16i1.pp423-436

Abstract

Large language models (LLMs) have paved the way for geographic information systems (GIS) that can solve spatial problems with minimal human intervention. However, current commercial LLM-based GIS solutions pose many limitations for researchers, such as proprietary APIs, high operational costs, and internet connectivity requirements, making them inaccessible in resource-constrained environments. To overcome these limitations, this paper introduces the LLM-Geo framework with the DS-GeoAI platform, integrating the DeepSeek-Coder model (the open-source, lightweight version deepseek-coder-1.3b-base) running directly on Google Colab. This approach eliminates API dependence, thus reducing deployment costs, and ensures data independence and sovereignty. Despite having only 1.3 billion parameters, DeepSeek-Coder proved to be highly effective, generating accurate Python code for complex spatial analysis and achieving a success rate comparable to commercial solutions. After an automated debugging step, the system achieved 90% accuracy across three case studies. With its strong error-handling capabilities and intelligent sample data generation, DS-GeoAI proves highly adaptable to real-world challenges. Quantitative results showed a cost reduction of up to 99% compared to API-based solutions, while expanding access to advanced geo-AI technology for organizations with limited resources.
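The automated debugging step mentioned above can be sketched as a generate-execute-repair loop: run the model's code, and on failure feed the error message back for another attempt. This is a hypothetical sketch; the `stub_model` function stands in for DeepSeek-Coder, and the function names are assumptions.

```python
# Sketch of an automated debugging loop: execute generated code, and on
# failure pass the exception text back to the model as repair feedback.
def run_with_repair(generate, task, max_attempts=3):
    feedback = ""
    for _ in range(max_attempts):
        code = generate(task, feedback)
        try:
            scope = {}
            exec(code, scope)           # run the generated analysis script
            return scope.get("result")
        except Exception as exc:
            feedback = f"previous attempt raised: {exc!r}"
    raise RuntimeError("no working code after retries")

# Stub model: the first attempt contains a NameError, the retry is fixed.
def stub_model(task, feedback):
    if not feedback:
        return "result = areaa * 2"     # intentional bug: undefined name
    return "area = 21\nresult = area * 2"

print(run_with_repair(stub_model, "double the area"))  # → 42
```

In a real deployment the repair prompt would include the original task, the failing code, and the traceback, and the loop would cap retries to bound cost.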
Automated data exploration with mutual information in natural language to visualization Luong, Hue Thi-Minh; Nguyen, Vinh-The; Nguyen, Van-Viet; Nguyen, Kim-Son; Nguyen, Huu-Khanh
IAES International Journal of Artificial Intelligence (IJ-AI) Vol 15, No 1: February 2026
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijai.v15.i1.pp129-139

Abstract

Transcribing natural language to visualization (NL2VIS) has been investigated for years but still suffers from several fundamental limitations (e.g., feature selection). Although large language models (LLMs) are good candidates, they incur high computational costs and their decisions are hard to trace. To alleviate this problem, we introduced an alternative information-theoretic framework that utilized mutual information (MI) to quantify the statistical relationship between utterances and database features. In our approach, kernel density estimation (KDE) and neural estimation techniques were utilized to estimate MI and to optimize a diversity-promoting objective balancing feature relevance and redundancy. We also introduced the information coverage ratio (ICR) to quantify the amount of information content preserved in feature selection decisions. In our experiments, we found that the proposed approach improved information-theoretic metrics, with an F1-score of 0.863 and an ICR of 0.891. We observed that these improvements did not come at the cost of traditional benchmarks: validity reached 88.9%, legality 85.2%, and chart-type accuracy 87.6%. Moreover, significance tests (p < 0.001) and large effect sizes (Cohen's d > 0.8) further supported that these improvements were meaningful for feature selection. Thus, this study provides a mathematical framework for applications requiring analytical validity that extends beyond NL2VIS to other machine learning contexts.
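The relevance-versus-redundancy objective described above can be illustrated with a much simpler estimator than the paper's KDE or neural ones: discrete mutual information plus a greedy selection that rewards MI with the target and penalizes MI with already-chosen features. Everything below (function names, the penalty weight `lam`) is an illustrative assumption.

```python
import numpy as np

# Illustrative sketch (not the paper's KDE/neural estimators): discrete MI
# and a greedy selection balancing relevance against redundancy.
def mutual_info(x, y):
    """MI in nats between two arrays of small non-negative integers."""
    joint = np.zeros((x.max() + 1, y.max() + 1))
    for xi, yi in zip(x, y):
        joint[xi, yi] += 1
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)   # marginal of x
    py = joint.sum(axis=0, keepdims=True)   # marginal of y
    nz = joint > 0
    return float((joint[nz] * np.log(joint[nz] / (px @ py)[nz])).sum())

def select_features(features, target, k, lam=0.5):
    """Greedily pick k columns maximizing relevance - lam * redundancy."""
    chosen = []
    while len(chosen) < k:
        best = max(
            (j for j in range(features.shape[1]) if j not in chosen),
            key=lambda j: mutual_info(features[:, j], target)
            - lam * sum(mutual_info(features[:, j], features[:, c]) for c in chosen),
        )
        chosen.append(best)
    return chosen
```

A sanity check on the estimator: MI of a balanced binary variable with itself equals its entropy, ln 2, and MI of empirically independent variables is zero, so a feature identical to the target is always selected first.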
From Feature Description to UML Architecture: A Novel Framework for Automated Reasoning and Multimodal Evaluation of Component and Deployment Diagram Nguyen, Van-Viet; Nguyen, Huu-Khanh; Nguyen, Kim-Son; Luong, Thi Minh-Hue; Bui, Anh-Tu; Vu, Duc-Quang; Nguyen, The-Vinh
Journal of Information Systems Engineering and Business Intelligence Vol. 12 No. 1 (2026): February
Publisher : Universitas Airlangga


Abstract

Background: Unified Modeling Language (UML) is fundamental to software architecture, yet the automated generation of high-level diagrams remains underexplored. Specifically, Component and Deployment diagrams pose significant challenges due to their high abstraction and complex architectural dependencies, which are difficult to infer from natural language descriptions alone. Objective: This study aimed to develop and validate a novel, end-to-end framework to bridge the gap between natural language feature descriptions and executable UML architectural diagrams. The primary goal was to fully automate the pipeline, from requirement generation to robust, multimodal validation of the final visual outputs. Methods: A quantitative study was conducted using a three-stage automated pipeline. First, LLaMA 3.2-1B-Instruct generated diverse feature descriptions. Second, DeepSeek-R1-Distill-Qwen-32B performed advanced reasoning to synthesize executable PlantUML code for Component and Deployment diagrams. Finally, a novel multimodal validation framework was introduced, employing an ensemble of three vision-language models—Qwen2.5-VL-3B, LLaMA-3.2-11B-Vision, and Aya-Vision-8B—to quantitatively assess the fidelity of the generated diagrams against their source descriptions. Results: Our framework demonstrated high fidelity in accurately capturing both system modularity (Component diagrams) and runtime allocation (Deployment diagrams). The reasoning-driven synthesis by DeepSeek-R1 significantly outperformed baseline models in generating architecturally correct diagrams. The multimodal evaluation pipeline effectively reduced scoring bias by integrating diverse validation perspectives. A key outcome is the creation of a systematically generated benchmark dataset of architectural diagrams. Conclusion: This study successfully establishes the viability of an AI-driven pipeline for automated UML architecture generation and validation. 
It provides three key contributions: the first fully automated pipeline for this task, a novel multimodal validation method, and a public benchmark dataset. This work lays a foundation for practical, AI-powered software architecture modeling. Future work should extend this framework to encompass behavioral UML diagrams.
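The synthesis stage described above targets PlantUML source, which is plain text and easy to assemble once the architectural structure (components and their dependencies) has been extracted. The sketch below hard-codes a hypothetical structure to show the target format; in the paper's pipeline that structure is derived by DeepSeek-R1-Distill-Qwen-32B, and the component names here are invented.

```python
# Assemble a PlantUML component diagram from structured output:
# a mapping of alias -> component name, and a list of dependency edges.
def to_plantuml(components, dependencies):
    lines = ["@startuml"]
    lines += [f'component "{name}" as {alias}' for alias, name in components.items()]
    lines += [f"{src} --> {dst}" for src, dst in dependencies]
    lines.append("@enduml")
    return "\n".join(lines)

# Hypothetical three-component architecture for illustration.
diagram = to_plantuml(
    {"api": "API Gateway", "auth": "Auth Service", "db": "User Store"},
    [("api", "auth"), ("auth", "db")],
)
print(diagram)
```

Because the output is deterministic text, downstream validation can first check syntactic executability (does PlantUML render it?) before the vision-language ensemble scores the rendered image against the source description.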