Journal : JOURNAL OF APPLIED INFORMATICS AND COMPUTING

Optimizing XGBoost for Heart Disease Risk Classification Using Optuna and Random Search on the Behavioral Risk Factor Surveillance System (BRFSS) 2023 Dataset
Dzaky, Muhammad; Kuncoro, Adam Prayogo; Riyanto, Riyanto
Journal of Applied Informatics and Computing Vol. 10 No. 1 (2026): February 2026
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v10i1.11897

Abstract

Heart disease is a critical public health issue in Indonesia, contributing to approximately 1.5 million deaths annually. Although machine learning methods, particularly Extreme Gradient Boosting (XGBoost), have demonstrated strong performance in medical classification tasks, their optimization on large-scale, highly imbalanced health datasets remains underexplored. This study optimizes XGBoost for heart disease risk classification using the Behavioral Risk Factor Surveillance System (BRFSS) 2023 dataset, which contains 290,156 samples after preprocessing. Two hyperparameter optimization approaches, Optuna and Random Search, are each evaluated with three class imbalance handling techniques: class weighting, SMOTE, and Random Undersampling (RUS). Model evaluation focuses on AUC and recall in order to prioritize sensitivity in identifying individuals at risk. The results show that the OptunaRUS (Optuna with RUS) and RandomWeight (Random Search with class weighting) models achieve the most stable performance, with OptunaRUS attaining an AUC of 83.06% and a recall of 75.69% on the test dataset. Feature importance analysis indicates that age range and hypertension are the most influential predictors. These findings suggest that hyperparameter optimization on large-scale health data improves the model's discriminative capability and generalization, and that selective sampling strategies such as RUS provide more stable performance than generative methods such as SMOTE on high-dimensional datasets.
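
Two of the imbalance handling techniques named in the abstract, Random Undersampling and class weighting, as well as the recall metric, can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the function names (`random_undersample`, `positive_class_weight`, `recall`) and the toy labels are hypothetical, and the abstract does not state how class weights were computed. The negative-to-positive ratio used here is a common choice for XGBoost's `scale_pos_weight` parameter and is assumed only for illustration.

```python
import random
from collections import Counter


def random_undersample(labels, seed=0):
    """Return sorted row indices that keep every minority-class row and an
    equally sized random subset of majority-class rows (binary labels 0/1)."""
    rng = random.Random(seed)
    by_class = {0: [], 1: []}
    for i, y in enumerate(labels):
        by_class[y].append(i)
    # The shorter index list is the minority class.
    minority, majority = sorted(by_class.values(), key=len)
    kept = minority + rng.sample(majority, len(minority))
    return sorted(kept)


def positive_class_weight(labels):
    """Ratio of negatives to positives (assumed weighting scheme)."""
    counts = Counter(labels)
    return counts[0] / counts[1]


def recall(y_true, y_pred):
    """Share of actual positives that the predictions recover."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn)


# Toy data (not from BRFSS): 90 negatives and 10 positives.
labels = [0] * 90 + [1] * 10
idx = random_undersample(labels, seed=42)
balanced = [labels[i] for i in idx]
```

Undersampling discards majority rows rather than synthesizing new minority rows, which is the distinction the abstract draws between RUS and SMOTE. Class weighting keeps every row and instead raises the penalty for errors on the positive class.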