Machine learning classification categorizes data based on learned patterns, which makes it well suited to analyzing the Human Development Index (HDI) in Indonesia. Because HDI is a key indicator of regional development progress, classifying HDI categories at the regency/city level can support targeted development planning. This study compares the performance of three ensemble-based classification methods (Random Forest, XGBoost, and LightGBM) in classifying HDI categories in Indonesia. The analysis uses 2023 data from the Central Bureau of Statistics (BPS), comprising 514 observations and nine variables. The same algorithms were also used to identify the variables that most influence HDI. The results show that LightGBM outperformed both Random Forest and XGBoost, achieving an accuracy of 0.937 without outlier handling and 0.944 with outlier handling. Per capita expenditure was identified as the most influential factor in predicting HDI. These findings contribute to statistical modeling by demonstrating how ensemble methods can improve classification accuracy, and they offer insights for data-driven policymaking, regional development planning, and future HDI-related research.
Copyright © 2025