Found 3 Documents

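Analisis Ulasan Aplikasi MyPertamina Menggunakan Topic Modeling dengan Latent Dirichlet Allocation [Analysis of MyPertamina Application Reviews Using Topic Modeling with Latent Dirichlet Allocation]
Muhammad Adrinta Abdurrazzaq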
KALBISCIENTIA Jurnal Sains dan Teknologi Vol. 10 No. 1 (2023): Sains dan Teknologi
Publisher : Research and Community Service INSTITUT TEKNOLOGI DAN BISNIS KALBIS

DOI: 10.53008/kalbiscientia.v10i1.694

Abstract

The policy requiring the MyPertamina application for purchases of subsidized fuel took effect on July 1, 2022. Through August 17, 2022, the enormous public demand for fuel made this policy a matter of concern. Application developers can update an application in response to user complaints and reviews. This study aims to map the problems in the MyPertamina application that users complain about most, using topic modeling with the LDA algorithm. The study divided the comments into 7 topics, with a reported perplexity value of -8.08844 and a topic coherence of 0.49860. These topics relate to user anxiety about the MyPertamina usage policy, problems in the registration process, problems in the authentication process, and problems with smooth use of the application caused by bugs and signal issues. In addition, 5341 comments were considered spam unrelated to the MyPertamina application.

LLM-Based Information Retrieval for Disease Detection Using Semantic Similarity
Muhammad Adrinta Abdurrazzaq; Edwin Lesmana Tjiong; Kent Algren Wanady
JOIN (Jurnal Online Informatika) Vol 10 No 1 (2025)
Publisher : Department of Informatics, UIN Sunan Gunung Djati Bandung

DOI: 10.15575/join.v10i1.1486

Abstract

Information retrieval systems are vital for disease prediction, but traditional methods like TF-IDF struggle with word meanings and produce long, complex vectors. This research uses Large Language Models (LLMs) and follows the CRISP-DM methodology to improve accuracy. Using health forum discussions labeled with specific diseases, we split the data into queries and a corpus. Semantic similarity is used to retrieve the most relevant text from the corpus. After preprocessing, we compare LLMs and TF-IDF, with LLMs achieving an accuracy of 0.911 (Top-K=30), outperforming TF-IDF. LLMs excel by creating shorter, meaningful vectors that preserve context, enabling precise semantic matching. These results demonstrate LLMs' potential to enhance healthcare information retrieval, offering more accurate and context-aware solutions. This research highlights how advanced AI can overcome traditional methods' limitations, opening new possibilities for medical informatics.
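
The abstract does not define how Top-K accuracy is computed, so the following is only a minimal sketch under one common reading: a query counts as correct if any of its K most similar corpus entries carries the same disease label. The vectors, labels, and function names (`cosine`, `top_k_accuracy`) are hypothetical toy values for illustration, not the paper's data or code; in practice the vectors would come from an LLM embedding model or from TF-IDF.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def top_k_accuracy(queries, corpus, k):
    """Fraction of queries whose label appears among the k nearest corpus items.

    Assumption: this hit-at-k definition is one plausible reading of
    "accuracy (Top-K)"; the abstract does not spell it out.
    """
    hits = 0
    for q_vec, q_label in queries:
        # Rank corpus entries from most to least similar to the query.
        ranked = sorted(corpus, key=lambda item: cosine(q_vec, item[0]), reverse=True)
        if any(label == q_label for _, label in ranked[:k]):
            hits += 1
    return hits / len(queries)

# Hypothetical 3-dimensional embeddings with disease labels.
corpus = [
    ([0.9, 0.1, 0.0], "flu"),
    ([0.1, 0.9, 0.1], "asthma"),
    ([0.0, 0.2, 0.9], "dermatitis"),
]
queries = [
    ([0.8, 0.2, 0.1], "flu"),
    ([0.2, 0.1, 0.9], "dermatitis"),
    ([0.7, 0.3, 0.0], "asthma"),  # nearest neighbour is "flu", so a top-1 miss
]

top1 = top_k_accuracy(queries, corpus, k=1)
top2 = top_k_accuracy(queries, corpus, k=2)
```

In this toy setup the third query is retrieved correctly only once K grows from 1 to 2, which illustrates why accuracy reported at a large K such as 30 is a looser criterion than top-1 accuracy.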

An Indonesian Chatbot for Disease Diagnosis Using Retrieval-Augmented Generation
Muhammad Adrinta Abdurrazzaq; Edwin Lesmana Tjiong; Aulia Fasya; Michelle Hiu; Joses Tanuwidjaya
INOVTEK Polbeng - Seri Informatika Vol. 10 No. 3 (2025): November
Publisher : P3M Politeknik Negeri Bengkalis

DOI: 10.35314/9nnkn955

Abstract

The rapid advancement of Large Language Models (LLMs) has enabled their use in medical information systems, although challenges such as hallucinations, domain mismatches, and the lack of a verified knowledge base remain significant, particularly in low-resource languages like Indonesian. This study introduces an Indonesian-language medical chatbot based on the open-source GPT-OSS-20B model, enhanced with a Retrieval-Augmented Generation (RAG) pipeline. The system combines semantic retrieval using jina-embeddings-v3, lexical re-ranking with the BM25 algorithm, and a lightweight Logistic Regression-based domain filter that screens queries first to prevent out-of-domain LLM usage. Evaluation using Indonesian medical articles and annotated patient-doctor conversations shows that the domain filter works well on synthetic data but misclassifies natural queries. A hybrid weighted reranker (FAISS L2 + BM25) performed best, with a Top-30 accuracy of 0.699. Black-box testing indicates that the system flow functions as designed, although response quality has not been validated by clinical experts. These findings suggest that RAG-based open-source LLMs can improve access to Indonesian-language medical information, but important limitations remain: the lack of clinical validation, potential errors in scraped data, and suboptimal robustness of the domain filter.
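
The abstract names a hybrid weighted reranker (FAISS L2 + BM25) but gives no weights, normalization, or fusion formula. The sketch below shows one plausible form of such a fusion and should not be read as the authors' implementation. Its assumptions: a brute-force L2 distance stands in for a FAISS index, BM25 is written inline with a Lucene-style smoothed IDF, both score lists are min-max normalized, and the weight `alpha` and all names (`bm25_scores`, `hybrid_rank`) and data are hypothetical.

```python
import math
from collections import Counter

def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Okapi BM25 score of each tokenized document for the query."""
    n_docs = len(docs_tokens)
    avg_len = sum(len(d) for d in docs_tokens) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs_tokens for term in set(d))
    scores = []
    for d in docs_tokens:
        tf = Counter(d)
        score = 0.0
        for term in query_tokens:
            if term not in tf:
                continue
            # Assumption: smoothed, non-negative IDF variant.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            denom = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores

def min_max(values):
    """Rescale values to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def hybrid_rank(query_vec, query_tokens, doc_vecs, docs_tokens, alpha=0.5):
    """Return document indices ordered by a weighted dense + lexical score.

    alpha weights the dense (L2-based) score; 1 - alpha weights BM25.
    """
    # Smaller L2 distance means more similar, so negate before normalizing.
    dense = min_max([-math.dist(query_vec, v) for v in doc_vecs])
    lexical = min_max(bm25_scores(query_tokens, docs_tokens))
    fused = [alpha * s + (1 - alpha) * l for s, l in zip(dense, lexical)]
    return sorted(range(len(fused)), key=lambda i: fused[i], reverse=True)

# Hypothetical toy corpus: tokenized texts with 2-dimensional embeddings.
docs_tokens = [
    ["fever", "cough", "cold"],
    ["chest", "pain", "breathless"],
    ["rash", "skin", "itchy"],
]
doc_vecs = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]

query_tokens = ["rash", "itchy"]
query_vec = [1.0, 0.0]

hybrid_order = hybrid_rank(query_vec, query_tokens, doc_vecs, docs_tokens, alpha=0.5)
dense_only_order = hybrid_rank(query_vec, query_tokens, doc_vecs, docs_tokens, alpha=1.0)
```

With `alpha=1.0` the ranking follows embedding distance alone and puts document 0 first; with `alpha=0.5` the exact term matches lift document 2 to the top. This is the kind of trade-off a weighted reranker is meant to tune.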