Stunting, a form of chronic nutritional deficiency in toddlers, remains a major public health concern because of its impact on child growth and development. Efforts to reduce its prevalence continue to be strengthened in Indonesia, particularly in Sumatra Province. This study evaluates the accuracy of a logistic regression model and three machine learning models (decision tree, random forest, and support vector machine (SVM)) in classifying stunting prevalence. The response variable is the prevalence of stunting among toddlers, categorized into two classes according to the 2024 national threshold: exceeding the national target and not exceeding it. Although classification models can provide accurate predictions, they often lack interpretability. This study therefore applies the SHapley Additive exPlanations (SHAP) method to the best-performing machine learning model to identify the key factors influencing stunting. The use of Shapley values is justified by the uniqueness theorem, which establishes them as the only attribution method that satisfies a set of desirable fairness properties. SHAP values explain a prediction using both the trained model and the underlying data. The results show that the random forest model achieves the highest accuracy (90.00%) among the four models. SHAP analysis reveals that Underweight is the most influential predictor of stunting prevalence in Sumatra Province. These findings highlight the importance of machine learning interpretability in supporting policy decisions to reduce stunting.
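To make the Shapley attribution mentioned above concrete, the following is a minimal sketch that computes exact Shapley values by enumerating feature coalitions and checks the efficiency property (the attributions sum to the gap between the full and the empty coalition). It is not the study's implementation: the function `shapley_values`, the value function `toy_value`, and its three anonymous features and scores are all hypothetical, chosen only for illustration. In practice, SHAP libraries use faster algorithms for tree models such as random forests.

```python
from itertools import combinations
from math import factorial


def shapley_values(value, n_features):
    """Exact Shapley values for a set function `value` over features 0..n_features-1.

    `value` takes a frozenset of feature indices and returns a number.
    The cost grows exponentially with n_features, so this is for toy cases only.
    """
    players = range(n_features)
    phi = [0.0] * n_features
    for i in players:
        others = [j for j in players if j != i]
        for size in range(len(others) + 1):
            # Shapley weight |S|! (n - |S| - 1)! / n! for coalitions S of this size
            weight = (
                factorial(size)
                * factorial(n_features - size - 1)
                / factorial(n_features)
            )
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of feature i when it joins coalition S
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_value(coalition):
    """Hypothetical model output when only the features in `coalition` are known."""
    score = 0.0
    if 0 in coalition:
        score += 0.30  # feature 0 acting alone
    if 1 in coalition:
        score += 0.10  # feature 1 acting alone
    if 0 in coalition and 1 in coalition:
        score += 0.20  # interaction, split equally between features 0 and 1
    return score  # feature 2 never changes the output


phi = shapley_values(toy_value, 3)
total = toy_value(frozenset({0, 1, 2})) - toy_value(frozenset())

for index, contribution in enumerate(phi):
    print(f"feature {index}: {contribution:.2f}")
print(f"sum of attributions: {sum(phi):.2f}, full minus empty: {total:.2f}")
```

In this toy case the feature that never changes the output receives zero attribution, and the interaction term is shared equally between the two features that produce it, which is the kind of fairness behavior the uniqueness result guarantees.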
Copyright © 2026