Class imbalance is one of the main challenges in developing disease classification models because it biases algorithms toward the majority class and weakens their detection of positive cases. This study analyzes the combination of the Synthetic Minority Over-sampling Technique (SMOTE) and XGBoost for measles classification. The dataset consisted of 1,000 records with the clinical features age, immunization history, fever, cough, runny nose, conjunctivitis, and skin rash, along with measles status as the class label. The data were split into two subsets: 80% for model training and 20% for testing. SMOTE was applied to the training data to address the imbalanced class distribution, and XGBoost was used to build the classification model. Model performance was evaluated using a confusion matrix and the metrics accuracy, precision, recall, and F1-score. XGBoost without SMOTE achieved an accuracy of 94.0%, precision of 83.3%, recall of 50.0%, and F1-score of 62.5%. After applying SMOTE, accuracy rose to 97.0%, recall to 95.0%, and F1-score to 86.4%, while precision decreased slightly to 79.2%. These results indicate that combining SMOTE with XGBoost detects positive measles cases in imbalanced data more effectively, at the cost of a small loss in precision.
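The four reported metrics all follow from the confusion matrix, so the arithmetic can be sketched in a few lines of Python. The abstract does not give the confusion-matrix counts; the counts below are an assumption, chosen as one set that reproduces the reported percentages for a 200-record test set (20% of 1,000) containing 20 positive cases. The function name `classification_metrics` is hypothetical and is not taken from the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# ASSUMED counts (not reported in the abstract): one confusion matrix per model
# that is consistent with the published percentages on 200 test records.
without_smote = classification_metrics(tp=10, fp=2, fn=10, tn=178)
with_smote = classification_metrics(tp=19, fp=5, fn=1, tn=175)

for name, metrics in (("XGBoost", without_smote), ("SMOTE + XGBoost", with_smote)):
    # Report each metric as a percentage rounded to one decimal place.
    print(name, {key: round(100 * value, 1) for key, value in metrics.items()})
```

Under these assumed counts, the gain in recall corresponds to far fewer missed positive cases (false negatives), while the small drop in precision corresponds to a few additional false positives.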
Copyrights © 2026