This study evaluates the effectiveness of data mining algorithms for heart failure classification. Several algorithms, including Random Forest, Decision Tree C4.5, Gradient Boosted Machine (GBM), and XGBoost, were applied to a heart failure dataset. The dataset was collected from multiple sources and preprocessed with SMOTE (Synthetic Minority Over-sampling Technique) to address class imbalance. The results indicate that applying SMOTE and optimizing parameters through grid search significantly improve the performance of these algorithms. XGBoost and GBM achieved the highest accuracy, precision, and recall on both balanced and imbalanced data. On balanced data, XGBoost reached an accuracy of 98.75% (error rate 1.25%), while GBM reached 98.60% (error rate 1.40%). The study confirms that appropriate data preprocessing and parameter optimization are crucial for improving the accuracy of medical data analysis. These findings suggest that XGBoost and GBM are highly effective for heart disease prediction, supporting early diagnosis and timely medical intervention. Future research should explore alternative preprocessing techniques and additional algorithms to further improve prediction outcomes.
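To illustrate the oversampling step named above, the following is a minimal sketch of the core SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority-class neighbours. It is not the study's implementation. The function name `smote`, the neighbour count `k`, the seed, and the toy data are all assumptions made for illustration, and the sketch picks base samples at random, which simplifies the original per-sample procedure.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority-class neighbours (simplified SMOTE)."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # a sample has at most len - 1 neighbours

    # For each minority sample, store the indices of its k nearest neighbours.
    neighbours = []
    for i, x in enumerate(minority):
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(x, minority[j]))
        neighbours.append(others[:k])

    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))   # random base sample
        j = rng.choice(neighbours[i])      # one of its nearest neighbours
        gap = rng.random()                 # position along the segment, in [0, 1)
        x, nb = minority[i], minority[j]
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic


# Hypothetical toy data (not from the study): 8 majority rows, 3 minority rows.
majority = [(float(i), float(i % 3)) for i in range(8)]
minority = [(10.0, 10.0), (11.0, 10.5), (10.5, 11.5)]

# Oversample the minority class until both classes have the same size.
new_rows = smote(minority, n_synthetic=len(majority) - len(minority), k=2)
balanced_minority = minority + new_rows
```

Because every synthetic row is a convex combination of two real minority rows, it always lies within the range of the observed minority data. In practice, oversampling is usually applied only to the training split, so that evaluation data contains no synthetic rows.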
Copyright © 2024