Poverty is a major issue in Central Java Province, with rates that fluctuate from year to year. Responding to this challenge effectively calls for a predictive, data-driven approach. This study applies machine learning techniques to forecast the number of people living in poverty in 2024 at the district/city level, using socio-economic data from 2019 to 2023 provided by the Central Bureau of Statistics (BPS). Seven indicators serve as predictor variables: the poverty line, the number of people living in poverty, the percentage of people living in poverty, the open unemployment rate, average years of schooling, the Human Development Index, and the regional minimum wage. The data were standardized using StandardScaler and split into training (80%) and testing (20%) sets. The study compares three regression algorithms (Linear Regression, Random Forest, and XGBoost) to evaluate how well each models the complexity of socio-economic data. XGBoost delivers the best performance, with a Mean Absolute Error (MAE) of 6,665 and an R² score of 0.978, outperforming Random Forest (MAE: 9,209; R²: 0.947) and Linear Regression (MAE: 10,917; R²: 0.896). By comparing these models, the study addresses a gap in the literature on the effectiveness of machine learning models for local-level poverty prediction. The findings suggest that XGBoost holds strong potential as a data-driven policy support tool, particularly for poverty alleviation planning and decision-making at the regional level.
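To make the preprocessing and evaluation terms above concrete, the following is a minimal standard-library sketch of z-score scaling (the transformation StandardScaler applies to each feature), an 80/20 split, and the MAE and R² metrics. It is not the study's code: the function names, the random seed, and the sample numbers are hypothetical and chosen only for illustration.

```python
import random
from statistics import fmean, pstdev


def standardize(column):
    """Z-score a column: subtract the mean, divide by the population std."""
    mu, sigma = fmean(column), pstdev(column)
    return [(x - mu) / sigma for x in column]


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows and hold out `test_fraction` for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)  # seed is an arbitrary assumption
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def mean_absolute_error(y_true, y_pred):
    """Average absolute gap between observed and predicted values."""
    return fmean(abs(t - p) for t, p in zip(y_true, y_pred))


def r2_score(y_true, y_pred):
    """1 minus (residual sum of squares / total sum of squares)."""
    mean_true = fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Made-up numbers, not values from the BPS data.
y_true = [100, 200, 300, 400]
y_pred = [110, 190, 310, 380]
mae = mean_absolute_error(y_true, y_pred)  # 12.5
r2 = r2_score(y_true, y_pred)              # 1 - 700 / 50000 = 0.986

train, test = train_test_split(list(range(10)))  # 8 training rows, 2 test rows
```

Read this way, an MAE of 6,665 means the predicted poverty count is off by about 6,665 people on average, and an R² of 0.978 means the model accounts for roughly 97.8% of the variance in the observed counts.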
Copyright © 2025