Stunting is a chronic nutritional problem with serious long-term effects on children’s health, including impaired physical growth, delayed cognitive development, and reduced productivity in adulthood. Early and accurate detection of stunting is therefore essential to support effective public health interventions and targeted policy implementation. A central challenge in developing machine learning models for this purpose is class imbalance in health-related datasets. Such imbalance frequently produces biased classifiers that perform well on majority classes but fail to identify minority categories, which reduces the reliability of the system. To address this issue, the present study applied the Synthetic Minority Oversampling Technique (SMOTE) to balance the class distribution in a dataset of 110,000 records. A Random Forest algorithm was then used as the base classifier, with hyperparameters tuned using the Optuna framework to improve robustness and generalizability. The experimental results show that combining SMOTE with Optuna improved classification performance and produced the highest macro-averaged Area Under the Curve (AUC), 0.9972. This score indicates that the model distinguishes nutritional status categories well across both majority and minority classes. The study concludes that addressing data imbalance through oversampling is a fundamental methodological step in constructing fair and effective machine learning systems for stunting detection, which can in turn contribute to improved health outcomes and evidence-based policy design.
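The oversampling step can be illustrated with a minimal sketch of the SMOTE idea: each synthetic record is placed at a random point on the line segment between a minority-class record and one of its nearest same-class neighbours. This is not the study's implementation. The function name `smote_oversample`, the parameter `k`, the two toy features, and the labels `"normal"` and `"stunted"` are assumptions made only for illustration.

```python
import math
import random
from collections import Counter


def smote_oversample(X, y, k=5, seed=0):
    """Balance classes by interpolating between same-class nearest neighbours.

    Hypothetical helper for illustration; not the code used in the study.
    """
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())  # size of the largest class
    X_out, y_out = list(X), list(y)
    for label, count in counts.items():
        members = [x for x, lab in zip(X, y) if lab == label]
        if count == target or len(members) < 2:
            continue  # already balanced, or no neighbour to interpolate with
        for _ in range(target - count):
            base = rng.choice(members)
            # k nearest same-class neighbours of the base record (itself excluded)
            others = [m for m in members if m is not base]
            neighbours = sorted(others, key=lambda m: math.dist(base, m))[:k]
            mate = rng.choice(neighbours)
            gap = rng.random()  # position along the segment from base to mate
            X_out.append([b + gap * (m - b) for b, m in zip(base, mate)])
            y_out.append(label)
    return X_out, y_out


# Toy data (assumed): six majority records and two minority records.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.2, 0.4], [0.4, 0.2],
     [2.0, 2.0], [2.2, 2.1]]
y = ["normal"] * 6 + ["stunted"] * 2

X_bal, y_bal = smote_oversample(X, y, k=3)
print(Counter(y_bal))  # both classes now have the same number of records
```

In a pipeline like the one described, oversampling would normally be applied only to the training split, so that the records used to compute the AUC remain original observations.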
Copyright © 2025