JIPI (Jurnal Ilmiah Penelitian dan Pembelajaran Informatika)
Vol 10, No 4 (2025)

CLASSIFICATION OF CARDIOVASCULAR AND CHRONIC RESPIRATORY DISEASES UTILIZING ENSEMBLE MODELS WITH DATA EXPLORATION TECHNIQUES

I Gusti Ngurah Sentana Putra (IPB University)
Amri Luthfi Najih (IPB University)
Unique DA Resiloy (IPB University)
Rachmat Bintang Yudhianto (IPB University)
Erfiani Erfiani (IPB University)
Anwar Fitrianto (IPB University)



Article Info

Publish Date
01 Dec 2025

Abstract

Non-communicable diseases, especially cardiovascular and chronic respiratory conditions, contribute significantly to Indonesia’s healthcare burden and to BPJS expenditure. Health claim data often suffer from class imbalance, multicollinearity, and outliers, all of which impair model accuracy. This study evaluates how data exploration and preprocessing techniques (winsorizing, correlation and VIF analysis, variable selection, and SMOTE) affect the performance of ensemble classifiers. The dataset comprises 497,439 BPJS health insurance claims from 2022 with 27 predictors (14 numerical and 13 categorical). Two data pipelines were compared: one without preprocessing and one incorporating systematic data exploration. Five tree-based models were tested: a single Decision Tree and four ensembles (Extra Trees, Random Forest, XGBoost, and LightGBM). Model performance was assessed using F1-score, balanced accuracy, and G-mean across 20 stratified cross-validations. The results show that preprocessing substantially improves classification fairness and accuracy. Bagging models, particularly Random Forest, improved the most, with balanced accuracy and G-mean increasing from around 0.93 to 0.99, whereas boosting models showed only modest gains. These findings indicate that rigorous data exploration enhances ensemble classifier performance, enabling more reliable disease classification and supporting fairer, data-driven decision-making in BPJS health management.
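
To illustrate why the abstract reports balanced accuracy and G-mean rather than plain accuracy, here is a minimal sketch in plain Python. It uses the standard textbook definitions (mean and geometric mean of per-class recall), which the abstract does not spell out, so treat them as assumptions. The function names and toy labels are hypothetical and are not taken from the study or from BPJS data.

```python
from collections import Counter
from math import prod


def per_class_recall(y_true, y_pred):
    """Recall for each class label that appears in y_true."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


def balanced_accuracy(y_true, y_pred):
    """Arithmetic mean of the per-class recalls."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


def g_mean(y_true, y_pred):
    """Geometric mean of the per-class recalls."""
    recalls = per_class_recall(y_true, y_pred)
    return prod(recalls.values()) ** (1 / len(recalls))


# Made-up imbalanced labels: 8 majority-class cases, 2 minority-class cases.
y_true = [0] * 8 + [1] * 2
# A classifier that gets every majority case right but misses one minority case.
y_pred = [0] * 8 + [0, 1]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
bal_acc = balanced_accuracy(y_true, y_pred)
gm = g_mean(y_true, y_pred)

print(f"accuracy={accuracy:.3f} balanced_accuracy={bal_acc:.3f} g_mean={gm:.3f}")
```

On this toy input, plain accuracy is high because the majority class dominates, while balanced accuracy and G-mean both drop because half of the minority class is missed. This sensitivity to minority-class recall is what makes the two metrics suitable for imbalanced claim data.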

Copyright © 2025






Journal Info

Subject

Computer Science & IT Education

Description

JIPI (Jurnal Ilmiah Penelitian dan Pembelajaran Informatika), e-ISSN 2540-8984, was established to publish scientific work in the form of research or papers, particularly in the field of Information Technology. JIPI is a journal that is managed by the ...