This research addresses the problem of low answer accuracy in chatbot systems based on Large Language Models (LLMs) when responding to questions derived from customer service documents. To overcome this problem, the Retrieval-Augmented Generation (RAG) method is applied, improving response quality by adding relevant context from external documents. Three LLMs from Meta AI are used in this study: LLaMA 3.1 8B, LLaMA 3.2 1B, and LLaMA 3.2 3B. Evaluation is conducted using the automatic ROUGE metrics (ROUGE-1, ROUGE-2, and ROUGE-L) and manual human evaluation assessing accuracy, relevance, and hallucination. This research contributes to the development of more reliable LLM-based question-answering systems enhanced with external contextual documents containing customer service information. The results show a significant improvement across all models after applying RAG: ROUGE F1-scores increased consistently, with LLaMA 3.1 8B showing the highest gain (from 0.12 to 0.58 on ROUGE-1), and human evaluation confirmed improvements in accuracy (up to +2.73 points) and reductions in hallucination (up to −2.63 points). These improvements were evident not only in larger models but also in smaller ones, indicating that the benefits of RAG do not depend on model size. In conclusion, RAG is highly effective in enhancing the accuracy and reliability of chatbot responses, especially in document-based question-answering scenarios: by leveraging contextual information from external documents, the system produces responses that are more factual, more relevant, and less prone to hallucination. RAG has thus proven an effective approach for improving the response quality of LLMs, including those with smaller parameter counts.
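For reference, the ROUGE-1 F1 score used in the evaluation reduces to a unigram-overlap computation between a candidate answer and a reference answer. The following minimal Python sketch illustrates the idea; it assumes simple lowercase whitespace tokenization and is not the exact scoring pipeline used in the study:

```python
from collections import Counter

def rouge1_f1(reference: str, candidate: str) -> float:
    """Illustrative ROUGE-1 F1: clipped unigram overlap between texts.

    Assumes lowercase whitespace tokenization; real ROUGE implementations
    apply additional preprocessing (e.g. stemming) configurable per run.
    """
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: each unigram counts at most as often as in either text.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)

# Example: a partially matching candidate scores between 0 and 1.
score = rouge1_f1("the cat sat on the mat", "the cat sat")
```

Scores range from 0 (no shared unigrams) to 1 (identical bags of words), which is the scale on which the reported gain from 0.12 to 0.58 is measured.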
Copyright © 2025