This study develops a domain-aware legal Question-Answering (QA) system for Indonesia’s Micro, Small, and Medium Enterprises (MSMEs) by proposing a hybrid Retrieval-Augmented Generation (RAG) framework that integrates Term Frequency–Inverse Document Frequency (TF-IDF), a Knowledge Graph (KG), and a Large Language Model (LLM). In this framework, TF-IDF performs lexical retrieval, identifying the most relevant documents by keyword weighting; the KG enriches retrieval with semantic relationships among legal entities, supplying context that keyword matching alone misses; and the LLM generates coherent responses conditioned on both the lexical and the KG-derived evidence. The combination is intended to strengthen factual grounding during retrieval and contextual reasoning during generation. The system is built on a curated dataset of 1,400 legal question–answer pairs collected from national legal repositories, including legislation, government regulations, and MSME digitalization guidelines. The pipeline comprises text preprocessing, keyword extraction using TF-IDF, semantic enrichment through a KG that maps legal entities and their relationships, and answer generation by the LLM within the RAG pipeline. The system was evaluated using Precision, Recall, F1-Score, Bilingual Evaluation Understudy (BLEU), and Recall-Oriented Understudy for Gisting Evaluation (ROUGE), with validation by five legal experts. Accuracy improved from 76.5% to 83.5% after integrating the KG, with Precision of 0.853, Recall of 0.877, and F1-Score of 0.865. The generative evaluation yielded a BLEU score of 0.9276 and a ROUGE-L score of 0.9301, indicating strong linguistic and semantic alignment between system outputs and expert-authored references.
The study concludes that this approach offers a practical foundation for building AI-based legal assistance tools and highlights future opportunities for expansion to other legal domains and multilingual RAG applications.
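The retrieval stage described above can be illustrated with a minimal sketch. It is not the study's implementation: the toy corpus `DOCS`, the toy graph `KG`, and the helper names (`build_index`, `expand_query`, `retrieve`, `build_prompt`) are all hypothetical, the KG is reduced to a simple term-to-related-terms lookup, and the LLM call is omitted. The sketch only shows how KG-based query expansion can change what a TF-IDF ranker is able to match.

```python
import math
from collections import Counter

# Hypothetical toy passages; not taken from the study's dataset.
DOCS = {
    "d1": "business license registration for micro enterprises",
    "d2": "tax obligations for small and medium enterprises",
    "d3": "halal certification procedure for food products",
}

# Hypothetical toy knowledge graph: term -> related legal terms.
KG = {
    "permit": ["license", "registration"],
    "umkm": ["micro", "small", "medium", "enterprises"],
}


def tokenize(text):
    """Lowercase whitespace tokenization (a stand-in for real preprocessing)."""
    return text.lower().split()


def build_index(docs):
    """Return smoothed IDF weights and one TF-IDF vector per document."""
    n = len(docs)
    tokens = {doc_id: tokenize(text) for doc_id, text in docs.items()}
    df = Counter(term for toks in tokens.values() for term in set(toks))
    idf = {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}
    vectors = {}
    for doc_id, toks in tokens.items():
        tf = Counter(toks)
        vectors[doc_id] = {t: (c / len(toks)) * idf[t] for t, c in tf.items()}
    return idf, vectors


def expand_query(terms, kg):
    """Append KG neighbours of each query term, keeping order and uniqueness."""
    expanded = list(terms)
    for term in terms:
        for neighbour in kg.get(term, []):
            if neighbour not in expanded:
                expanded.append(neighbour)
    return expanded


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, docs, kg, top_k=2):
    """Rank documents by TF-IDF cosine similarity to the KG-expanded query."""
    idf, vectors = build_index(docs)
    terms = expand_query(tokenize(query), kg)
    tf = Counter(t for t in terms if t in idf)  # drop out-of-vocabulary terms
    total = sum(tf.values())
    q_vec = {t: (c / total) * idf[t] for t, c in tf.items()} if total else {}
    scored = sorted(((cosine(q_vec, v), d) for d, v in vectors.items()), reverse=True)
    return [(doc_id, score) for score, doc_id in scored[:top_k] if score > 0]


def build_prompt(query, doc_ids, docs):
    """Assemble retrieved passages into a prompt; the LLM call itself is omitted."""
    context = "\n".join(f"[{d}] {docs[d]}" for d in doc_ids)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


results = retrieve("permit for umkm", DOCS, KG)
prompt = build_prompt("permit for umkm", [d for d, _ in results], DOCS)
```

In this toy setup the query terms "permit" and "umkm" do not occur in any passage, so a purely lexical ranker can match only the common word "for"; expansion through the graph is what lets the licensing passage rank first. A production system would presumably use a real graph store and a proper Indonesian-language tokenizer in place of these stand-ins.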
Copyrights © 2026