Ihya' Nashirudin Abrar
Master Program of Informatics, Universitas Ahmad Dahlan


Comparative Evaluation of Boosting Ensemble Models for Medication Adherence Prediction in Patients with Non-Communicable Diseases
Ihya' Nashirudin Abrar; Muhammad Kunta Biddinika; Herman Yuliansyah
Jurnal Sisfokom (Sistem Informasi dan Komputer) Vol. 15 No. 3 (2026): JULY
Publisher : ISB Atma Luhur

DOI: 10.32736/sisfokom.v15i3.2712

Abstract

Hypertension and diabetes mellitus are among the leading drivers of premature mortality worldwide. Long-term disease management depends critically on patient adherence to prescribed regimens; however, adherence rates in chronic-illness populations remain persistently low, particularly in developing regions. Although predictive studies on medication adherence have frequently employed Random Forest, Logistic Regression, and Support Vector Machines, a systematic benchmark of modern boosting ensembles on imbalanced clinical datasets has yet to be established. To address this gap, the present study evaluates five boosting algorithms — XGBoost, AdaBoost, Gradient Boosting, LightGBM, and CatBoost — using a publicly accessible medical claims dataset from the Cimas Medical Aid Society, Zimbabwe, comprising 24,084 patient records and 11 predictor variables. The dataset exhibits moderate class imbalance (59.85% non-adherent; 40.15% adherent). The experimental pipeline included data cleaning, stratified 80:20 splitting, class-weight calibration, uniform baseline hyperparameters (n_estimators = 100, learning_rate = 0.1), 10-fold stratified cross-validation, and Wilcoxon signed-rank statistical testing. LightGBM outperformed all competing models, achieving an accuracy of 0.8163, AUC-ROC of 0.9044, F1-scores of 0.8007 (adherent) and 0.8296 (non-adherent), and a Matthews Correlation Coefficient of 0.6540, with cross-validation confirming stability (0.8147 ± 0.0069). Feature importance analysis identified Annual Claim Amount, Units Total, and Age as the most informative predictors. This work delivers the first empirical benchmark of five contemporary boosting ensembles for NCD medication adherence prediction, integrating class-weighted training and statistical validation within a unified framework, offering actionable guidance for model selection in resource-limited clinical settings.
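
Two quantities named in the abstract, class weights for imbalanced training and the Matthews Correlation Coefficient (MCC), can be illustrated with a short sketch. The abstract does not say how the class weights were calibrated, so the "balanced" heuristic below (n_samples / (n_classes * class_count)) is an assumption, and the function names and toy numbers are hypothetical rather than taken from the study.

```python
import math
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    Assumption: the study's "class-weight calibration" is not specified;
    this is the common "balanced" heuristic, shown only for illustration.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def mcc_from_confusion(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from binary confusion-matrix counts."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal total is zero.
    return numerator / denominator if denominator else 0.0


# Toy data (hypothetical, not from the study):
# 6 non-adherent records (label 0) and 4 adherent records (label 1).
toy_labels = [0] * 6 + [1] * 4
weights = balanced_class_weights(toy_labels)

# Toy confusion matrix for the same 10 records, adherent as the positive class.
toy_mcc = mcc_from_confusion(tp=3, tn=5, fp=1, fn=1)

print(weights)
print(round(toy_mcc, 4))
```

Under this heuristic the minority class receives the larger weight, so misclassifying an adherent patient costs more during training. MCC uses all four confusion-matrix cells, which is why it is often reported alongside accuracy on imbalanced data.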