Journal : Jurnal Teknik Informatika (JUTIF)

Systematic Optimization of Ensemble Learning for Heart Failure Survival Prediction using SHAP and Optuna
Setia, Bayu; Zaky, Umar
Jurnal Teknik Informatika (Jutif) Vol. 6 No. 5 (2025): JUTIF Volume 6, Number 5, October 2025
Publisher : Informatika, Universitas Jenderal Soedirman

DOI: 10.52436/1.jutif.2025.6.5.5324

Abstract

Heart failure (HF) is a major global health problem, and accurate early prediction of patient prognosis is essential for improving clinical management and care. Standard machine learning models in this domain are often hampered by class imbalance in clinical datasets. To address this, the study introduces a systematically optimized ensemble learning model for classifying patient survival. The methodology was applied to a publicly accessible clinical dataset of 299 heart failure patients. The framework comprised logarithmic transformation, stratified data splitting (80:20), SHAP-based selection of eight key features, and hyperparameter tuning with Optuna over 75 trials, with the objective of maximizing the F1-score under 10-fold cross-validation. Three ensemble models (Random Forest, XGBoost, and LightGBM) were then refined with decision threshold tuning. The fully optimized Random Forest model performed best, attaining an accuracy of 96.67%, an F1-score of 0.9474, and precision and recall of 0.95 each, with only one false negative and one false positive. The study concludes that the systematic application of SHAP, SMOTE, and Optuna within an ensemble framework substantially improves classification performance on imbalanced HF data and surpasses existing benchmarks. The work thus provides a replicable, systematic framework for developing reliable machine learning models from complex, imbalanced medical datasets, contributing a methodology to the field of computational science.
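
The reported figures can be cross-checked with simple confusion-matrix arithmetic. The sketch below is not taken from the paper: it assumes the 20% hold-out of 299 patients amounts to 60 test cases, and it assumes 18 true positives, a count the abstract does not state but which, together with the one false positive and one false negative it does report, reproduces the published accuracy and F1-score.

```python
# Counts below are partly inferred, not reported in the abstract.
TEST_SIZE = 60        # assumption: 20% of 299 patients, rounded up
FALSE_POSITIVES = 1   # stated in the abstract
FALSE_NEGATIVES = 1   # stated in the abstract
TRUE_POSITIVES = 18   # assumption: chosen because it reproduces F1 = 0.9474


def classification_metrics(tp, fp, fn, total):
    """Return accuracy, precision, recall and F1 from confusion-matrix counts."""
    tn = total - tp - fp - fn          # remaining cases are true negatives
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


metrics = classification_metrics(
    TRUE_POSITIVES, FALSE_POSITIVES, FALSE_NEGATIVES, TEST_SIZE
)

for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Under these assumed counts, accuracy is 58/60 and precision, recall and F1 are all 18/19, which round to the 96.67%, 0.95 and 0.9474 quoted in the abstract.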