CAUCHY: Jurnal Matematika Murni dan Aplikasi
Vol 11, No 1 (2026): CAUCHY: JURNAL MATEMATIKA MURNI DAN APLIKASI

Coronary Heart Disease Risk Prediction under Class Imbalance Using XGBoost with SHAP-Based Interpretation

Amiroch, Siti
Laili, Fitri Nur
Rohmah, Awawin Mustana
Kardono, Dicka Yale



Article Info

Publish Date
30 May 2026

Abstract

Coronary heart disease (CHD) risk prediction is challenging because clinical data are heterogeneous and the response variable is imbalanced. This study develops an interpretable predictive framework for CHD risk that combines median imputation, IQR-based winsorization, standardization, and the Synthetic Minority Over-sampling Technique (SMOTE) with an Extreme Gradient Boosting (XGBoost) classifier, bootstrap-based uncertainty assessment, and SHapley Additive exPlanations (SHAP). The learning problem is formulated within a regularized empirical risk minimization framework, so the model is viewed as a statistical estimator rather than merely an algorithmic classifier. To avoid information leakage, train–test splitting is performed before any resampling, and SMOTE is applied only to the training data. The primary analysis is fixed a priori at an 80:20 stratified split, whereas 60:40 and 70:30 splits are treated as sensitivity analyses rather than model-selection devices. In the primary analysis, the model attains an accuracy of 79.36%, a precision of 27.88%, a recall of 22.48%, an F1-score of 24.89%, and a ROC–AUC of 0.6502. The 95% bootstrap confidence interval for ROC–AUC is [0.6017, 0.6981]. SHAP analysis in probability space identifies age, cigsPerDay, male, heartRate, and sysBP as the most influential predictors. These results show that the proposed framework is mathematically well-structured and interpretable, but that its out-of-sample discrimination on this dataset is moderate rather than high.
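
As a rough illustration of two quantities reported in the abstract, the sketch below recomputes the F1-score from the stated precision and recall and shows one common way to obtain a percentile bootstrap interval for ROC–AUC. It is a minimal sketch, not the authors' code: the function names (`f1_from`, `roc_auc`, `bootstrap_auc_ci`), the number of resamples, and the synthetic labels and scores are all assumptions made for illustration, and the abstract does not state which bootstrap variant the study used.

```python
import random


def f1_from(precision, recall):
    """F1-score as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def roc_auc(labels, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=300, alpha=0.05, seed=0):
    """Percentile bootstrap interval for ROC-AUC (assumed variant)."""
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            aucs.append(roc_auc(ys, [scores[i] for i in idx]))
    aucs.sort()
    lower = aucs[int((alpha / 2) * n_boot)]
    upper = aucs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Consistency check on the reported metrics (values taken from the abstract).
f1 = f1_from(0.2788, 0.2248)
print(round(f1, 4))  # 0.2489, in line with the reported 24.89%

# Synthetic, imbalanced test set (hypothetical data, NOT the study's dataset).
rng = random.Random(42)
labels = [1] * 30 + [0] * 170
scores = [rng.gauss(0.5 if y else 0.0, 1.0) for y in labels]

auc = roc_auc(labels, scores)
ci_low, ci_high = bootstrap_auc_ci(labels, scores)
print(f"AUC = {auc:.4f}, 95% CI = [{ci_low:.4f}, {ci_high:.4f}]")
```

Only held-out test predictions are resampled here, so the interval reflects sampling variability in the test set and not variability from refitting the model; a study could reasonably choose either design.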

Copyrights © 2026






Journal Info

Abbrev
Math

Subject
Mathematics

Description

Jurnal CAUCHY is published regularly, two (2) times a year. The editorial board accepts scholarly articles reporting research results, literature reviews, analyses, and problem solving in the field of Mathematics (Algebra, Analysis, Statistics, Computation, and Applied Mathematics). Accepted manuscripts will be reviewed by ...