Poverty remains a multidimensional challenge in Central Java, and identifying its socioeconomic determinants calls for robust, data-driven approaches. This study applied six machine learning models: Extreme Gradient Boosting (XGBoost), Random Forest, CatBoost, LightGBM, Elastic Net Regression, and a Stacking ensemble. The models used district-level data from Statistics Indonesia covering demographics, education, labor, infrastructure, and household welfare. Model evaluation combined an 80:20 hold-out split, 10-fold cross-validation, and noise perturbation tests. XGBoost achieved the best individual performance (MAE = 2,180.01; RMSE = 3,512.07; R² = 0.931), while the Stacking ensemble outperformed all single learners on RMSE and R² (RMSE = 3,202.79; R² = 0.942), although its MAE (2,640.99) was higher than that of XGBoost. Interpretability was supported by SHAP (SHapley Additive exPlanations), Partial Dependence Plots (PDP), and Accumulated Local Effects (ALE), which consistently identified Number of Households, Per Capita Expenditure, and Uninhabitable Houses as the most influential predictors. Counterfactual simulations indicated that increasing per capita expenditure by 10% could reduce the poverty index by 9.9%, and that reducing household size by 10% could lower it by 11.3%. Robustness checks identified Brebes as an influential district for model stability. Overall, the findings demonstrate that boosting and stacking ensembles, combined with explainable AI tools, both improve predictive accuracy and provide transparent, policy-relevant evidence to strengthen poverty alleviation programs in Central Java. The study contributes methodological advances in explainable machine learning and practical insights for targeted poverty reduction strategies.
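The counterfactual simulations mentioned above can be illustrated with a minimal sketch: scale one feature by a fixed percentage in every district, hold all other features fixed, and compare the mean prediction before and after. The helper `counterfactual_change`, the stand-in model `toy_predict`, its coefficients, the feature names, and the two example rows are all hypothetical and chosen only for illustration; they are not the study's model or data, and the sketch is not expected to reproduce the reported 9.9% and 11.3% figures.

```python
from statistics import mean
from typing import Callable, Dict, List

Row = Dict[str, float]


def counterfactual_change(
    predict: Callable[[Row], float],
    rows: List[Row],
    feature: str,
    pct: float,
) -> float:
    """Relative change (in %) of the mean prediction when `feature` is
    scaled by (1 + pct / 100) in every row, all other features held fixed."""
    baseline = mean(predict(r) for r in rows)
    shifted_rows = [{**r, feature: r[feature] * (1 + pct / 100)} for r in rows]
    shifted = mean(predict(r) for r in shifted_rows)
    return 100 * (shifted - baseline) / baseline


# Hypothetical stand-in for a fitted model (assumption: any callable that
# maps a feature dict to a predicted poverty index would work here).
def toy_predict(row: Row) -> float:
    return 50.0 + 0.02 * row["households"] - 0.003 * row["per_capita_expenditure"]


# Made-up district rows, for illustration only.
toy_rows = [
    {"households": 1000.0, "per_capita_expenditure": 10000.0},
    {"households": 2000.0, "per_capita_expenditure": 12000.0},
]

expenditure_effect = counterfactual_change(
    toy_predict, toy_rows, "per_capita_expenditure", 10
)
household_effect = counterfactual_change(toy_predict, toy_rows, "households", -10)
print(f"expenditure +10%: {expenditure_effect:.2f}%")
print(f"households -10%: {household_effect:.2f}%")
```

With a real fitted ensemble, `predict` would wrap the model's own prediction call. Because the comparison is purely model-based, the resulting percentages describe how the model responds to the perturbation and not a causal effect.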
Copyright © 2025