Heart disease remains one of the leading causes of death worldwide, and its prevalence is rising, including in Indonesia. Delayed detection and diagnosis are the main challenges in treating the disease, because most cases are identified only after patients experience serious symptoms or a heart attack. Medical data often contain outliers and noise, which complicates the development of accurate predictive models. This study develops a heart disease prediction model that combines the Interquartile Range (IQR) method for outlier handling with the Extreme Gradient Boosting (XGBoost) algorithm for predictive modeling. The IQR method is applied at the pre-processing stage to identify and remove outliers robustly while preserving data integrity, and XGBoost is used to build an efficient prediction model through an ensemble learning approach. The results show a substantial improvement in model performance, with accuracy increasing from 75.41% to 89.47% and AUC-ROC from 0.8615 to 0.9450. The model shows balanced predictive capability, with precision of 95.24% and recall of 80.00% for cases without disease, and precision of 86.11% and recall of 96.88% for cases with disease. The developed model contributes in two ways: it improves data quality through robust IQR-based outlier handling, and it yields a more accurate prediction model by leveraging the strengths of XGBoost's ensemble learning approach.
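As an illustration of the IQR pre-processing step described above, the following is a minimal sketch, not the study's actual implementation. It assumes the conventional fence multiplier of 1.5 (the abstract does not state the value used), and the helper names `iqr_bounds` and `drop_outlier_rows`, the column name `chol`, and the sample values are hypothetical and not taken from the study's dataset. The XGBoost training step is not shown.

```python
from statistics import quantiles


def iqr_bounds(values, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR.

    k=1.5 is the conventional choice and is an assumption here.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_outlier_rows(rows, columns, k=1.5):
    """Keep only rows whose value in every listed column lies within its fences."""
    bounds = {c: iqr_bounds([r[c] for r in rows], k) for c in columns}
    return [
        r for r in rows
        if all(bounds[c][0] <= r[c] <= bounds[c][1] for c in columns)
    ]


# Made-up illustrative values for one hypothetical numeric feature.
rows = [{"chol": v} for v in [180, 195, 200, 210, 215, 220, 230, 240, 564]]
cleaned = drop_outlier_rows(rows, ["chol"])
print(len(rows), "->", len(cleaned))  # the row with chol=564 falls outside the fences
```

In a pipeline of the kind the abstract describes, the filtered rows would then be passed to the classifier. Whether outliers are dropped, as here, or capped at the fences is a design choice the abstract does not specify beyond "eliminate".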
Copyrights © 2025