This study develops a pediatric disease classification model based on traditional Balinese medical texts by fine-tuning IndoBERT on the Lontar Usada Rare dataset. The dataset consists of 422 entries covering disease symptoms, disease types, medicinal ingredients, and treatment procedures, obtained from transliterations of lontar manuscripts and interviews with traditional medicine experts. Pre-processing consisted of case folding, cleansing, and normalization, followed by label encoding of the 35 disease classes. IndoBERT was fine-tuned with the AdamW optimizer at a learning rate of 5e-5, a batch size of 8, and 15 epochs. In evaluation, the model achieved 90.59% accuracy, 94.71% precision, 90.59% recall, and a 90.99% F1-score, indicating that it captures the linguistic context of traditional medical text well. The accompanying recommendation system combines the model's prediction with TF-IDF-based cosine similarity to return the most relevant treatment recommendations for a user's symptom input. This research contributes to the digitization and preservation of Balinese traditional medical knowledge by providing a structured and widely accessible digital knowledge base.
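The retrieval step described above, ranking stored entries by TF-IDF cosine similarity to the user's symptom text, can be illustrated with a minimal sketch. The abstract does not say how the class prediction and the similarity score are combined, so filtering candidates by the predicted class is an assumption made here. The entries, labels, field names, and the functions `preprocess`, `tfidf_vectors`, `cosine`, and `recommend` are hypothetical placeholders, not data or code from the Lontar Usada Rare dataset, and `predicted_label` stands in for the output of the fine-tuned IndoBERT classifier.

```python
import math
import re
from collections import Counter

# Hypothetical placeholder entries; NOT taken from the actual dataset.
ENTRIES = [
    {"label": "fever", "symptoms": "high body temperature and shivering at night",
     "treatment": "placeholder treatment A"},
    {"label": "fever", "symptoms": "warm forehead with restless sleep",
     "treatment": "placeholder treatment B"},
    {"label": "cough", "symptoms": "dry cough and sore throat",
     "treatment": "placeholder treatment C"},
]

def preprocess(text):
    """Case folding and cleansing: lowercase, drop non-letters, split on whitespace."""
    return re.sub(r"[^a-z\s]", " ", text.lower()).split()

def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (a dict) per tokenized document."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed inverse document frequency, so every known term has positive weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}
    vecs = [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]
    return vecs, idf

def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def recommend(query, entries, predicted_label=None):
    """Rank entries by similarity to the query.

    Assumption: candidates are restricted to the class predicted by the classifier.
    """
    pool = [e for e in entries
            if predicted_label is None or e["label"] == predicted_label]
    vecs, idf = tfidf_vectors([preprocess(e["symptoms"]) for e in pool])
    # Query terms unseen in the candidate pool carry no weight and are dropped.
    q = {t: tf * idf[t] for t, tf in Counter(preprocess(query)).items() if t in idf}
    scored = [(cosine(q, v), e) for v, e in zip(vecs, pool)]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)

# "fever" stands in for the classifier's prediction on this symptom text.
ranked = recommend("Child has high temperature and shivering", ENTRIES, "fever")
for score, entry in ranked:
    print(f"{score:.3f}  {entry['treatment']}")
```

Restricting the candidate pool to the predicted class keeps the ranking within one disease type. A system could instead rank all entries and use the prediction only as a tie-breaker; the abstract does not specify which design was used.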
Copyrights © 2026