Poverty remains a major challenge in Indonesia, with a national poverty rate of 9.36 percent in 2023 and significant disparities between rural (12.22 percent) and urban (7.29 percent) areas. Outliers can also distort classification analysis at the district/city level. This study aims to classify poverty levels in 514 districts/cities into high (above 9.36 percent) and low (at or below 9.36 percent) categories using logistic regression, and to compare model performance on the original data with performance on data whose outliers were adjusted using the Z-score and interquartile range (IQR) methods. The methods applied include the collection of secondary data from the Central Statistics Agency and the Ministry of Home Affairs; exploratory data analysis to identify patterns and correlations (such as the negative correlation between per capita expenditure and poverty); pre-processing by capping outliers; logistic regression training with hyperparameter tuning through grid search and cross-validation; and evaluation using accuracy, precision, recall, F1-score, and area under the receiver operating characteristic curve (ROC-AUC). The predictor variables are gross domestic product (GDP), life expectancy, average length of schooling, and per capita expenditure. The results show consistent performance across the outlier-handling techniques, with test accuracy reaching 77.67 percent, ROC-AUC of 0.8566, macro precision of 77.90 percent, macro recall of 77.79 percent, and macro F1-score of 77.66 percent. Outlier handling reduced the standard deviation of the poverty rate from 6.45 to 5.99 (Z-score) and 5.57 (IQR) without changing the distribution of binary labels (266 low, 248 high). The model coefficients confirm the dominant negative influence of per capita expenditure (-1.067), supporting targeted policies to reduce regional disparities.
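The capping step and the unchanged label counts can be illustrated with a minimal sketch. It assumes that capping means clipping values to mean ± z standard deviations (Z-score) or to the Tukey fences Q1 − k·IQR and Q3 + k·IQR (IQR); the study's exact cut-offs are not stated here, so `z`, `k`, the function names, and the sample rates below are all hypothetical. Because clipping leaves every value inside the bounds untouched and preserves ordering, the binary labels stay the same whenever the 9.36 percent threshold lies inside the bounds, which is one plausible reason the label distribution did not change.

```python
import statistics

THRESHOLD = 9.36  # national poverty rate (percent), used as the class boundary


def clip(values, lower, upper):
    """Clip every value into the closed interval [lower, upper]."""
    return [min(max(v, lower), upper) for v in values]


def cap_zscore(values, z=3.0):
    """Cap values lying more than z population standard deviations from the mean."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return clip(values, mean - z * std, mean + z * std)


def cap_iqr(values, k=1.5):
    """Cap values lying outside the fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return clip(values, q1 - k * iqr, q3 + k * iqr)


def label(values, threshold=THRESHOLD):
    """Return 1 (high poverty) for values above the threshold, else 0 (low)."""
    return [1 if v > threshold else 0 for v in values]


# Made-up poverty rates (percent) for ten hypothetical districts; not study data.
rates = [2.5, 4.1, 5.8, 7.3, 8.9, 9.4, 10.2, 12.6, 15.0, 41.7]

# z=2.0 is used only because a 10-point sample cannot exceed |z| = 3.
capped_z = cap_zscore(rates, z=2.0)
capped_iqr = cap_iqr(rates)

for name, data in [("original", rates), ("z-score", capped_z), ("iqr", capped_iqr)]:
    print(name, round(statistics.pstdev(data), 2), sum(label(data)))
```

In this toy sample only the extreme value is pulled in, so the spread shrinks under both methods while the count of high-poverty labels is identical across the three versions of the data.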
Copyright © 2025