Journal of Applied Data Sciences
Vol 7, No 2: May 2026

Performance Evaluation of Support Vector Machine (SVM) and XGBoost for Predicting Toddlers’ Stunting Status Based on Anthropometric Data

Nurjoko, Nurjoko
Syarif, Admi
Lumbanraja, Favorisen R.
Berawi, Khairunisa



Article Info

Publish Date
13 May 2026

Abstract

Stunting remains a major global health concern, particularly in developing countries, because of its long-term effects on physical growth, cognitive development, and overall well-being. Despite various public health initiatives, early detection remains difficult, which motivates accurate, data-driven predictive models that can support targeted interventions. This study develops and compares two machine learning algorithms, Support Vector Machine (SVM) and Extreme Gradient Boosting (XGBoost), for classifying stunting status among children under five, in order to determine which is more effective for early prediction. A quantitative machine learning approach was applied to a dataset of 17,498 records derived from Posyandu (integrated community health post) data in Lampung Province, Indonesia. The analytical pipeline comprised data preprocessing, class rebalancing with the Synthetic Minority Over-sampling Technique (SMOTE), and model evaluation through stratified 10-fold cross-validation. Performance was assessed using accuracy, precision, recall, and F1-score. The XGBoost model performed best, with accuracy, precision, recall, and F1-score each reaching 0.9979. The SVM model produced slightly lower yet still strong results, with an accuracy of 0.9949 and similarly consistent values on the other metrics. These findings suggest that XGBoost handles high-dimensional, imbalanced data more effectively and captures nonlinear patterns in the dataset. XGBoost was therefore identified as the better method for stunting classification in this study, outperforming SVM on all evaluation metrics. The results support integrating boosting-based models into early detection systems for child nutritional assessment. Future studies should incorporate additional environmental and socioeconomic variables and evaluate model applicability in real-time community health settings.
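The evaluation protocol named in the abstract (stratified 10-fold cross-validation scored by accuracy, precision, recall, and F1-score) can be illustrated with a minimal standard-library sketch. It is not the authors' code: the function names `stratified_folds` and `binary_metrics`, the binary label encoding (1 = stunted, 0 = not stunted), and the toy data are all assumptions made for illustration, and the SMOTE and classifier steps are omitted. The abstract does not state how many stunting classes were used or how the metrics were averaged.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=10, seed=0):
    """Assign sample indices to k folds while keeping class ratios similar."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class round-robin so every fold receives its share.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / denom if denom else 0.0,
    }


# Toy labels, made up for illustration only (hypothetical 20% minority class).
labels = [1] * 20 + [0] * 80
folds = stratified_folds(labels, k=10)

# Toy predictions for a single held-out fold, also made up.
example = binary_metrics([1, 1, 1, 0, 0, 0, 0, 0], [1, 1, 0, 1, 0, 0, 0, 0])
```

In a full pipeline each fold would serve once as the held-out set, with the model trained on the remaining nine folds and the metrics averaged over the ten runs. Library implementations such as scikit-learn's `StratifiedKFold` provide the same splitting behaviour with more options.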

Copyright © 2026






Journal Info

Abbrev

JADS

Subject

Computer Science & IT; Control & Systems Engineering; Decision Sciences, Operations Research & Management

Description

One of the current hot topics in science is data: how can datasets be used in scientific and scholarly research in a more reliable, citable and accountable way? Data is of paramount importance to scientific progress, yet most research data remains private. Enhancing the transparency of the processes ...