JOIN (Jurnal Online Informatika)
Vol 10 No 1 (2025)

LLM-Based Information Retrieval for Disease Detection Using Semantic Similarity

Muhammad Adrinta Abdurrazzaq
Edwin Lesmana Tjiong
Kent Algren Wanady



Article Info

Publish Date
01 Apr 2025

Abstract

Information retrieval systems are vital for disease prediction, but traditional methods such as TF-IDF struggle to capture word meaning and produce long, complex vectors. This research applies Large Language Models (LLMs) to the retrieval task, following the CRISP-DM methodology, to improve accuracy. Using health forum discussions labeled with specific diseases, we split the data into queries and a corpus. For each query, semantic similarity is used to retrieve the most relevant texts from the corpus. After preprocessing, we compare LLM-based and TF-IDF representations; the LLMs achieve an accuracy of 0.911 at Top-K = 30, outperforming TF-IDF. LLMs excel by creating shorter, meaningful vectors that preserve context, enabling precise semantic matching. These results demonstrate the potential of LLMs to enhance healthcare information retrieval with more accurate, context-aware solutions. The research also shows how advanced AI can overcome the limitations of traditional methods, opening new possibilities for medical informatics.
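The retrieval and scoring loop described in the abstract can be illustrated with a minimal sketch. It covers only the TF-IDF baseline, written with the Python standard library, and it is not the authors' implementation. Several things here are assumptions: the function names, the toy texts and labels (invented for illustration, not taken from the paper's dataset), the smoothed IDF formula, and the definition of Top-K accuracy as "at least one of the K retrieved corpus texts carries the query's disease label", which the abstract does not spell out.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a stand-in for real preprocessing)."""
    return text.lower().split()


def fit_idf(corpus_tokens):
    """Smoothed inverse document frequency, computed on the corpus only."""
    n_docs = len(corpus_tokens)
    doc_freq = Counter()
    for tokens in corpus_tokens:
        doc_freq.update(set(tokens))
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}


def tfidf_vector(tokens, idf):
    """Sparse TF-IDF vector as a dict; terms unseen in the corpus are dropped."""
    counts = Counter(t for t in tokens if t in idf)
    return {t: c * idf[t] for t, c in counts.items()}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_k(query_vec, corpus_vecs, k):
    """Indices of the k corpus vectors most similar to the query."""
    scored = [(cosine(query_vec, vec), i) for i, vec in enumerate(corpus_vecs)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # ties broken by index
    return [i for _, i in scored[:k]]


def top_k_accuracy(query_vecs, query_labels, corpus_vecs, corpus_labels, k):
    """Assumed metric: a query is a hit if any retrieved text shares its label."""
    hits = 0
    for vec, label in zip(query_vecs, query_labels):
        if any(corpus_labels[i] == label for i in top_k(vec, corpus_vecs, k)):
            hits += 1
    return hits / len(query_labels)


# Toy data, invented for illustration only.
corpus = [
    ("persistent cough and fever with chest pain", "pneumonia"),
    ("itchy red rash on the skin", "eczema"),
    ("frequent thirst and high blood sugar", "diabetes"),
]
queries = [
    ("fever and cough for a week", "pneumonia"),
    ("red itchy skin rash", "eczema"),
]

corpus_tokens = [tokenize(text) for text, _ in corpus]
corpus_labels = [label for _, label in corpus]
idf = fit_idf(corpus_tokens)
corpus_vecs = [tfidf_vector(tokens, idf) for tokens in corpus_tokens]
query_vecs = [tfidf_vector(tokenize(text), idf) for text, _ in queries]
query_labels = [label for _, label in queries]

accuracy_at_1 = top_k_accuracy(query_vecs, query_labels, corpus_vecs, corpus_labels, k=1)
```

An LLM-based variant would keep `top_k` and `top_k_accuracy` unchanged in spirit and replace `tfidf_vector` with a call to an embedding model that returns a dense vector, with `cosine` adapted to operate on lists of floats. Which model and pooling strategy the authors used is not stated in the abstract.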

Copyright © 2025






Journal Info

Abbrev

join

Publisher

Subject

Computer Science & IT

Description

JOIN (Jurnal Online Informatika) is a scientific journal published by the Department of Informatics UIN Sunan Gunung Djati Bandung. This journal contains scientific papers from Academics, Researchers, and Practitioners about research on informatics. JOIN (Jurnal Online Informatika) is published ...