Heart disease is one of the leading causes of death worldwide, and accurate predictive methods are needed to support early detection and clinical decision-making. This study analyzes and compares the performance of three supervised machine learning algorithms, K-Nearest Neighbors (KNN), Support Vector Machine (SVM), and Random Forest (RF), in classifying heart disease using the Cleveland Heart Disease dataset, which consists of 303 patient records with 13 clinical features. The research stages comprise data preprocessing, splitting the dataset into 80% training data and 20% testing data, model training, and hyperparameter optimization using GridSearchCV with 5-fold cross-validation. After optimization, each model was used to predict the held-out test data, and its performance was evaluated to assess generalization ability using accuracy, precision, recall, F1-score, AUC-ROC, and the confusion matrix. The results show that KNN and Random Forest both achieved the highest accuracy, 90.16%. The KNN model obtained a recall of 1.0000, indicating perfect sensitivity in detecting positive cases in the test set, while Random Forest showed a more balanced trade-off between precision and recall and the highest AUC, 0.9481. Based on these findings, KNN is considered the most suitable model for medical screening purposes, as it detected all positive heart disease patients in the test set without producing false negatives. This study is expected to serve as a reference for implementing machine learning based on clinical data as a decision support tool for early heart disease detection.
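The workflow described above (80/20 split, GridSearchCV with 5-fold cross-validation, then evaluation on the held-out test data) can be sketched with scikit-learn roughly as follows. This is a minimal illustration, not the study's actual code: the data are a synthetic stand-in with the same shape as the Cleveland dataset (303 rows, 13 features), and the scaling step, parameter grids, and random seeds are assumptions chosen only for the example, so the printed numbers will not match the reported results.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score, roc_auc_score)
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# ASSUMPTION: synthetic stand-in for the Cleveland data (303 records, 13 features).
X, y = make_classification(n_samples=303, n_features=13, n_informative=8,
                           random_state=42)

# 80% training data, 20% testing data (stratified to keep class balance).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

# ASSUMPTION: illustrative hyperparameter grids, not the ones used in the study.
candidates = {
    "KNN": (KNeighborsClassifier(),
            {"clf__n_neighbors": [3, 5, 7, 9, 11]}),
    "SVM": (SVC(probability=True, random_state=42),
            {"clf__C": [0.1, 1, 10], "clf__kernel": ["linear", "rbf"]}),
    "RF": (RandomForestClassifier(random_state=42),
           {"clf__n_estimators": [50, 100], "clf__max_depth": [None, 5]}),
}

results = {}
for name, (estimator, grid) in candidates.items():
    # Scaling lives inside the pipeline so each CV fold is scaled on its own
    # training portion only (ASSUMPTION: standard scaling as preprocessing).
    pipe = Pipeline([("scale", StandardScaler()), ("clf", estimator)])
    search = GridSearchCV(pipe, grid, cv=5, scoring="accuracy")
    search.fit(X_train, y_train)

    # Evaluate the refitted best model on the untouched test split.
    y_pred = search.predict(X_test)
    y_score = search.predict_proba(X_test)[:, 1]
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred),
        "recall": recall_score(y_test, y_pred),
        "f1": f1_score(y_test, y_pred),
        "auc": roc_auc_score(y_test, y_score),
        "confusion_matrix": confusion_matrix(y_test, y_pred),
    }

for name, metrics in results.items():
    print(name, {k: round(v, 4) for k, v in metrics.items()
                 if k != "confusion_matrix"})
```

In the confusion matrix returned by scikit-learn, the entry at row 1, column 0 counts false negatives, so a recall of 1.0000 corresponds to that cell being zero. With 303 records, a 20% test split contains 61 records.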
Copyright © 2026