Lontar Usada Rare is a traditional Balinese manuscript that records pediatric medical knowledge rooted in local wisdom. Its narrative format limits accessibility and use in modern contexts, and its physical fragility threatens long-term preservation. This study develops a pediatric disease classification model that combines a Support Vector Machine (SVM) with Term Frequency–Inverse Document Frequency (TF-IDF) weighting to support the digitalization of Balinese traditional medicine. A total of 422 samples were collected through expert interviews and manuscript analysis, covering symptoms, disease types, herbal ingredients, and treatment procedures. The research stages comprised text preprocessing (cleaning, tokenization, stopword removal, and stemming), manual labeling into 35 disease classes, and model evaluation across five train–test split ratios (from 80:20 to 60:40) and five values of the SVM regularization parameter C (0.5, 1, 10, 100, and 1000). The best performance was obtained with C = 10 at the 80:20 ratio: 87.06% accuracy, 91.55% precision, 87.06% recall, and an F1-score of 87.96%. Confusion matrix analysis showed strong performance for most classes, although minority classes with overlapping symptoms were sometimes misclassified. Overall, the combination of TF-IDF and a linear SVM classifies pediatric disease symptoms from Lontar Usada Rare effectively, and it contributes to the preservation and digital transformation of Balinese traditional medical knowledge for potential modern healthcare applications.
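To make the TF-IDF weighting step concrete, the following is a minimal sketch in plain Python. It assumes the textbook form, term frequency multiplied by ln(N / df); the abstract does not state which TF-IDF variant or library the study used, so this choice is an assumption. The function name `tfidf` and the toy symptom tokens are hypothetical and are not taken from the manuscript data.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed weighting (textbook form, not confirmed by the study):
    weight = (term count / document length) * ln(N / document frequency)
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical symptom descriptions, already cleaned, tokenized,
# stopword-filtered, and stemmed.
docs = [
    ["fever", "cough", "fever"],
    ["fever", "rash"],
    ["cough", "vomiting"],
]
vectors = tfidf(docs)
```

In this toy corpus, "rash" appears in only one document and so receives a higher weight in the second document than "fever", which appears in two. Under this formula a term that occurs in every document gets weight zero. Vectors of this kind would then serve as the input features for the linear SVM.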
Copyrights © 2026