BAREKENG: Jurnal Ilmu Matematika dan Terapan
Vol 20 No 3 (2026): BAREKENG: Journal of Mathematics and Its Application

COMPARATIVE STUDY OF LIGHTGBM, CATBOOST, AND RANDOM FOREST IN MODELING PUBLIC COMPLAINTS CLASSIFICATION

Oktaviyani Daswati (Department of Statistics and Data Science, School of Data Science, Mathematics, and Informatics, IPB University, Indonesia)
Hari Wijayanto (Department of Statistics and Data Science, School of Data Science, Mathematics, and Informatics, IPB University, Indonesia)
Farit Mochamad Afendi (Department of Statistics and Data Science, School of Data Science, Mathematics, and Informatics, IPB University, Indonesia)



Article Info

Publish Date
08 Apr 2026

Abstract

Public complaint data on maladministration in Indonesia contain high-cardinality categorical variables and imbalanced category distributions, which pose significant challenges for conventional machine learning algorithms. To address this issue, this study evaluates and compares the performance of three widely used classification algorithms (LightGBM, CatBoost, and Random Forest) on actual public complaint data that have never before been analysed using machine learning methods. Hyperparameter tuning was applied to each algorithm to obtain its best-performing configuration within the search space. The analysis used 30 repeated simulations, with accuracy and sensitivity as the primary metrics. ANOVA followed by Tukey HSD was used to test whether performance differed between models at the 95% confidence level. The results show that LightGBM performed best, with an accuracy of 74.50% and a sensitivity of 76.70%, followed by CatBoost with an accuracy of 74.12% and a sensitivity of 75.54%, while Random Forest lagged far behind. Statistical tests confirmed significant performance differences between the three models. This study has limitations: only three classification algorithms were evaluated, encoding strategies were not systematically compared, and the hyperparameter search space was restricted, so broader model exploration may yield improved performance. Nonetheless, the study offers the first empirical application of machine learning to Indonesian public complaint data on maladministration and demonstrates how algorithm selection directly affects predictive outcomes when handling complex categorical structures. The findings offer practical insights for government agencies, highlighting how data-driven models can support policy design, strengthen transparency, and improve the quality of public services.
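The evaluation protocol described in the abstract (per-run accuracy and sensitivity, then a one-way ANOVA across models) can be illustrated with a minimal sketch. It is not the authors' implementation: the scores below are synthetic placeholders rather than results from the study, the model names `model_a`, `model_b`, and `model_c` are hypothetical, sensitivity is assumed here to be binary recall for a single positive class, and the Tukey HSD post hoc step is omitted because it requires the studentized range distribution.

```python
import random
from statistics import mean


def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def sensitivity(y_true, y_pred, positive=1):
    """Recall of the positive class: TP / (TP + FN).

    Assumes at least one positive example in y_true.
    """
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp / (tp + fn)


def one_way_anova_f(groups):
    """F statistic: between-group mean square over within-group mean square."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# SYNTHETIC placeholder accuracies for 30 repeated runs of three hypothetical
# models. These are NOT the study's results; they only exercise the test.
rng = random.Random(0)
scores = {
    "model_a": [rng.gauss(0.80, 0.01) for _ in range(30)],
    "model_b": [rng.gauss(0.79, 0.01) for _ in range(30)],
    "model_c": [rng.gauss(0.70, 0.01) for _ in range(30)],
}

f_stat = one_way_anova_f(list(scores.values()))

# Approximate upper 5% point of F(2, 87) read from standard F tables.
F_CRITICAL = 3.10

print(f"F = {f_stat:.2f}, reject equal means at 5%: {f_stat > F_CRITICAL}")
```

A significant F statistic only indicates that at least one model mean differs; identifying which pairs differ is the role of the post hoc test. In practice a statistics library would normally be used for both steps, for example `scipy.stats.f_oneway` and `scipy.stats.tukey_hsd` in recent SciPy releases.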

Copyrights © 2026






Journal Info

Abbrev

barekeng

Publisher

Subject

Computer Science & IT; Control & Systems Engineering; Economics, Econometrics & Finance; Energy; Engineering; Mathematics; Mechanical Engineering; Physics; Transportation

Description

BAREKENG: Jurnal Ilmu Matematika dan Terapan is a scientific journal that publishes articles reporting the results of research or studies in Pure Mathematics and Applied Mathematics. The focus and scope of BAREKENG: Jurnal Ilmu Matematika dan Terapan are as follows: - Pure ...