Heart disease remains a leading cause of mortality worldwide, underscoring the need for early and accurate diagnosis to reduce complications and improve patient outcomes. Recent advances in machine learning have enabled predictive models that help healthcare professionals detect disease from patient medical records. This study develops Extreme Gradient Boosting (XGBoost) and Light Gradient Boosting Machine (LightGBM) models for heart disease prediction and compares their performance. The dataset, obtained from the UCI Machine Learning Repository, consists of 303 patient records with binary class labels indicating the presence or absence of heart disease. Preprocessing involved feature standardization with StandardScaler and handling of class imbalance with the Synthetic Minority Over-sampling Technique (SMOTE). Models were evaluated using stratified k-fold cross-validation with k = 3, 5, and 7 for a more robust performance assessment. Hyperparameters were optimized using RandomizedSearchCV. Both XGBoost and LightGBM achieved strong classification performance, with accuracy exceeding 80% and AUC values above 0.89. LightGBM showed slightly better average accuracy, F1-score, and stability across folds, while XGBoost achieved higher precision, reflecting better control of false positives. Overall, both algorithms are effective for heart disease prediction, supporting the potential of machine learning in early disease detection and clinical decision-support systems.
Copyright © 2025