Sinkron : Jurnal dan Penelitian Teknik Informatika
Vol. 10 No. 1 (2026): Article Research January 2026

Heart Disease Classification Using Optimised XGBoost and Random Forest with SHAP Explanations

Hutagalung, Pancar Hizkia (Unknown)
Andrianingsih, Andrianingsih (Unknown)



Article Info

Publish Date
03 Jan 2026

Abstract

Heart disease remains one of the leading causes of global morbidity, creating a need for accurate and interpretable computational tools to support early diagnosis. However, many existing studies on the Cleveland Heart Disease dataset rely on limited validation protocols, apply only a single hyperparameter optimisation strategy, or provide narrow explainability analyses, which can lead to optimistic performance estimates and inconsistent clinical insight. This study addresses these gaps by proposing a classification-based prediction framework that evaluates Random Forest and XGBoost for binary heart-disease classification under three hyperparameter optimisation strategies random search, Bayesian optimisation, and particle swarm optimisation (PSO) within a nested, anti-leakage cross-validation design, while SHAP is employed to analyse model interpretability across the best-performing configurations. The experimental results show that the ensemble classifiers achieve strong and consistent performance, with ROC–AUC values ranging from 0.8908 to 0.9089 across all scenarios; Random Forest optimised with PSO obtained the highest ROC–AUC (0.9089 ± 0.0146) and F1-score (0.8188 ± 0.0206), whereas XGBoost with Bayesian optimisation reached comparable performance without statistically significant differences. SHAP analyses identified oldpeak, ca, thal, cp, thalach, and exang as the most influential features, in line with established clinical indicators of myocardial ischemia and perfusion abnormalities. These findings indicate that combining tree-based ensemble classifiers with systematic hyperparameter optimisation and SHAP-based interpretability can enhance the reliability and transparency of heart-disease classification on the Cleveland dataset, while highlighting the need for further validation on contemporary, multi-centre clinical data.

Copyrights © 2026






Journal Info

Abbrev

sinkron

Publisher

Subject

Computer Science & IT

Description

Scope of SinkrOns Scientific Discussion 1. Machine Learning 2. Cryptography 3. Steganography 4. Digital Image Processing 5. Networking 6. Security 7. Algorithm and Programming 8. Computer Vision 9. Troubleshooting 10. Internet and E-Commerce 11. Artificial Intelligence 12. Data Mining 13. Artificial ...