This study compares the performance of several machine learning algorithms, including ensemble methods, in predicting heart disease on two datasets drawn from the UCI Machine Learning Repository and Kaggle. Nine algorithms were tested: Logistic Regression (LR), K-Nearest Neighbors (KNN), Decision Tree (DT), Random Forest (RF), XGBoost, LightGBM, CatBoost, Support Vector Machine (SVM), and Naive Bayes (NB). The data were cleaned, normalized, and split into training and test sets. In the experiments, KNN performed best with an accuracy of 91.80%, followed by SVM and RF, which also gave stable and effective results on complex datasets. Although DT and NB achieved lower accuracy, the results indicate that basic machine learning algorithms can provide adequate performance for heart disease classification. The study recommends the use of ensemble algorithms and further exploration of feature engineering to improve predictions.
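The workflow summarized above (normalization, a train/test split, and accuracy-based evaluation of a classifier) can be illustrated with a minimal sketch. It is not the study's implementation: the data are synthetic, the feature names (`age`, `chol`) are hypothetical, min-max scaling is assumed because the abstract does not name the normalization method, and `k = 5` and the 80/20 split are arbitrary choices. Only KNN is shown, written with the Python standard library so the example stays self-contained.

```python
import math
import random
from collections import Counter


def make_synthetic(n=200, seed=0):
    """Hypothetical stand-in data: two features and a binary label."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        # Feature scales differ on purpose so that normalization matters.
        age = rng.gauss(50 + 10 * label, 5)
        chol = rng.gauss(200 + 40 * label, 20)
        rows.append(([age, chol], label))
    return rows


def train_test_split(rows, test_ratio=0.2, seed=0):
    """Shuffle a copy of the rows and cut it into training and test parts."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_ratio))
    return shuffled[:cut], shuffled[cut:]


def fit_min_max(train):
    """Compute per-feature (min, max) from the training rows only."""
    columns = zip(*(x for x, _ in train))
    return [(min(c), max(c)) for c in columns]


def apply_min_max(rows, bounds):
    """Rescale each feature using bounds fitted on the training set."""
    scaled = []
    for x, y in rows:
        features = [
            (v - lo) / (hi - lo) if hi > lo else 0.0
            for v, (lo, hi) in zip(x, bounds)
        ]
        scaled.append((features, y))
    return scaled


def knn_predict(train, x, k=5):
    """Majority vote among the k training points closest to x."""
    nearest = sorted(train, key=lambda r: math.dist(r[0], x))[:k]
    return Counter(y for _, y in nearest).most_common(1)[0][0]


def accuracy(train, test, k=5):
    """Fraction of test rows whose predicted label matches the true label."""
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)


rows = make_synthetic()
train, test = train_test_split(rows)
bounds = fit_min_max(train)
train_n = apply_min_max(train, bounds)
test_n = apply_min_max(test, bounds)
acc = accuracy(train_n, test_n)
print(f"KNN accuracy on synthetic data: {acc:.2%}")
```

The scaling bounds are fitted on the training rows alone and then applied to the test rows, so no information from the test set leaks into preprocessing. The accuracy printed here reflects only the synthetic data and is not comparable to the figures reported in the study.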
Copyrights © 2022