INOVTEK Polbeng - Seri Informatika
Vol. 10 No. 3 (2025): November

An Indonesian Chatbot for Disease Diagnosis Using Retrieval-Augmented Generation

Muhammad Adrinta Abdurrazzaq
Edwin Lesmana Tjiong
Aulia Fasya
Michelle Hiu
Joses Tanuwidjaya



Article Info

Publish Date
26 Nov 2025

Abstract

The rapid advancement of Large Language Models (LLMs) has enabled their use in medical information systems, but hallucinations, domain mismatch, and the lack of a verified knowledge base remain significant challenges, particularly in low-resource languages like Indonesian. This study introduces an Indonesian-language medical chatbot built on the open-source GPT-OSS-20B model and enhanced with a Retrieval-Augmented Generation (RAG) pipeline. The system combines semantic retrieval using jina-embeddings-v3, lexical re-ranking with the BM25 algorithm, and a lightweight Logistic Regression-based domain filter that acts as an initial gate to keep out-of-domain queries from reaching the LLM. Evaluation on Indonesian medical articles and annotated patient-doctor conversations shows that the domain filter performs well on synthetic data but misclassifies natural queries. A hybrid weighted reranker (FAISS L2 + BM25) performed best, with a Top-30 accuracy of 0.699. Black-box testing indicates that the system flow functions as designed, although response quality has not been validated by clinical experts. These findings suggest that RAG-based open-source LLMs can improve access to Indonesian-language medical information, but important limitations remain: the lack of clinical validation, potential errors in scraped data, and limited robustness of the domain filter.
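
As an illustration only, the hybrid weighted reranking and Top-k accuracy mentioned in the abstract might be sketched as follows. The abstract does not state the weighting scheme, the score normalization, or the tokenization, so the `alpha` weight, the min-max scaling, and all function names here are assumptions. Pure-Python L2 distances over toy 2-D vectors stand in for the FAISS index and the jina-embeddings-v3 vectors, and the three toy documents are invented for the example.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Okapi BM25 score of each tokenized document against the query."""
    n_docs = len(docs_tokens)
    avg_len = sum(len(d) for d in docs_tokens) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs_tokens for term in set(d))
    scores = []
    for doc in docs_tokens:
        tf = Counter(doc)
        score = 0.0
        for term in query_tokens:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def l2_distances(query_vec, doc_vecs):
    """Euclidean distance from the query vector to each document vector."""
    return [math.dist(query_vec, v) for v in doc_vecs]


def min_max(values):
    """Scale values to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def hybrid_rank(query_tokens, query_vec, docs_tokens, doc_vecs, alpha=0.5):
    """Return document indices ordered best-first by a weighted score.

    alpha is a hypothetical weight on the semantic (L2) component;
    (1 - alpha) goes to the lexical (BM25) component.
    """
    lexical = min_max(bm25_scores(query_tokens, docs_tokens))
    # Smaller L2 distance means more similar, so invert after scaling.
    semantic = [1.0 - d for d in min_max(l2_distances(query_vec, doc_vecs))]
    combined = [alpha * s + (1 - alpha) * l for s, l in zip(semantic, lexical)]
    return sorted(range(len(combined)), key=lambda i: combined[i], reverse=True)


def top_k_accuracy(rankings, gold, k):
    """Fraction of queries whose gold document appears in the top k results."""
    hits = sum(1 for ranked, g in zip(rankings, gold) if g in ranked[:k])
    return hits / len(gold)


# Toy corpus (invented): tokenized documents and made-up 2-D embeddings.
docs_tokens = [
    ["demam", "batuk", "pilek", "flu"],
    ["nyeri", "dada", "sesak", "jantung"],
    ["gatal", "ruam", "kulit", "alergi"],
]
doc_vecs = [[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]]

ranking = hybrid_rank(["demam", "batuk"], [0.8, 0.2], docs_tokens, doc_vecs)
print(ranking)
```

Under this reading, a Top-30 accuracy of 0.699 would mean that the annotated target article appeared among the first 30 reranked results for about 70% of the evaluation queries.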

Copyright © 2025

Journal Info

Abbrev

ISI

Publisher

Subject

Computer Science & IT

Description

The Journal of Innovation and Technology (INOVTEK Polbeng—Seri Informatika) is a distinguished publication hosted by the State Polytechnic of Bengkalis. Dedicated to advancing the field of informatics, this scientific research journal serves as a vital platform for academics, researchers, and ...