Jurnal Teknik Informatika (JUTIF)
Vol. 7 No. 2 (2026): JUTIF Volume 7, Number 2, April 2026

Optimizing Heart Disease Classification Using C4.5, Random Forest, and XGBoost with ANOVA, Chi-Square, and AdaBoost

Pratama, Andika
Assegaff, Setiawan
Jasmir, Jasmir
Nurhadi, Nurhadi



Article Info

Publish Date
15 Apr 2026

Abstract

Heart disease remains one of the leading causes of mortality worldwide, underscoring the need for accurate and scalable prediction models in clinical informatics. This study proposes a leakage-safe machine learning pipeline that combines stratified splitting, SMOTE-based imbalance handling, and in-fold feature selection using ANOVA, Chi-Square, and AdaBoost-assisted ranking to improve classification performance on a large heart-disease dataset of 10,000 samples and 21 attributes. Three widely used algorithms, C4.5, Random Forest, and XGBoost, were evaluated to determine the best combination of model and feature-selection method for structured medical data. The results indicate that feature relevance contributes more to predictive performance than increased model complexity: Random Forest achieved the highest accuracy, precision, recall, and F1-score at 98.43% when combined with Chi-Square or ANOVA feature selection. C4.5 showed the greatest relative improvement, rising from 76.52% to 97.57% with AdaBoost-assisted selection, while XGBoost improved from 66.32% to 94.88% after statistical filtering. The dominant features identified, such as CRP, BMI, blood pressure, fasting glucose, LDL, triglycerides, and homocysteine, align with well-established cardiovascular biomarkers, supporting clinical validity. This research contributes to computer science by demonstrating an efficient and scalable hybrid framework of feature selection (FS) and boosting that can reduce unnecessary model complexity, improve generalization, and support low-latency deployment in clinical decision-support systems. The findings highlight the potential of structured-data machine learning to strengthen digital health diagnostics in resource-limited environments.
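
The "leakage-safe" idea in the abstract can be illustrated with a minimal sketch. It is not the authors' code. It assumes scikit-learn, uses synthetic stand-in data rather than the study's dataset, and shows only Random Forest with the ANOVA and Chi-Square filters. The helper name build_pipeline, the sample count, the feature count, and k=10 are all hypothetical choices made for illustration. SMOTE is left out so the sketch depends on scikit-learn alone. With the imbalanced-learn package, its own Pipeline class would presumably be used so that a SMOTE step could sit before the selector and be fit on training folds only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, chi2, f_classif
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in data (assumption): NOT the study's heart-disease dataset.
X, y = make_classification(
    n_samples=2000, n_features=20, n_informative=8,
    weights=[0.7, 0.3], random_state=0,
)

# Stratified hold-out split keeps the class ratio the same in both partitions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)


def build_pipeline(method: str, k: int = 10) -> Pipeline:
    """Hypothetical helper: filter-based selection followed by Random Forest."""
    steps = []
    if method == "chi2":
        # chi2 requires non-negative inputs, so rescale to [0, 1] first.
        steps.append(("scale", MinMaxScaler()))
        steps.append(("select", SelectKBest(chi2, k=k)))
    else:
        # ANOVA F-test between each feature and the class label.
        steps.append(("select", SelectKBest(f_classif, k=k)))
    steps.append(("clf", RandomForestClassifier(n_estimators=50, random_state=0)))
    return Pipeline(steps)


cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
results = {}
for method in ("anova", "chi2"):
    pipe = build_pipeline(method)
    # cross_val_score clones and refits the whole pipeline for every fold,
    # so the selector is scored on that fold's training rows only.
    fold_f1 = cross_val_score(pipe, X_train, y_train, cv=cv, scoring="f1")
    # Refit on the full training partition, then score once on the hold-out set.
    pipe.fit(X_train, y_train)
    results[method] = {
        "cv_f1": fold_f1,
        "test_accuracy": pipe.score(X_test, y_test),
        "n_selected": int(pipe.named_steps["select"].get_support().sum()),
    }

for method, r in results.items():
    print(method, round(r["cv_f1"].mean(), 4), round(r["test_accuracy"], 4))
```

Placing the selector inside the pipeline, rather than running it once on the full dataset before cross-validation, is what keeps validation rows from influencing which features are kept. Scores from this synthetic data say nothing about the figures reported in the abstract.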

Copyright © 2026






Journal Info

Abbrev

jurnal

Publisher

Subject

Computer Science & IT

Description

Jurnal Teknik Informatika (JUTIF) is an Indonesian national journal that publishes high-quality research papers in the broad fields of Informatics, Information Systems, and Computer Science, which encompass software engineering, information system development, computer systems, computer network, ...