Heart disease is one of the leading causes of death worldwide, which makes data-driven early detection crucial for medical decision-support systems. A major challenge in developing heart disease prediction models is dataset quality, including class distributions that are often imbalanced and can degrade the performance of classification algorithms. This study analyzes the effect of the Synthetic Minority Oversampling Technique (SMOTE) on the performance of three classification algorithms: Support Vector Classifier (SVC), Random Forest (RF), and K-Nearest Neighbor (KNN). The dataset used is heart_disease50.csv, which contains 4,001 patient records with 21 predictor attributes and one target variable (heart disease status: “Yes” or “No”) and has a relatively balanced class distribution. The research process comprises data preprocessing (data cleaning, normalization, and encoding), data partitioning using Stratified K-Fold Cross Validation (k=5), application of SMOTE to the training data only, construction of the classification models, and evaluation using accuracy, precision, recall, F1-score, and AUC-ROC. The results show that applying SMOTE does not always improve performance. The accuracy of SVC decreased with SMOTE (0.4819) compared to SVC without SMOTE (0.5106), while Random Forest remained essentially unchanged (0.4669 without SMOTE and 0.4644 with SMOTE). KNN with SMOTE was the best-performing model, with an accuracy of 0.5268 and a precision of 0.5271, although its AUC-ROC was the same as that of KNN without SMOTE (0.5135). Overall, these results confirm that the effectiveness of SMOTE depends strongly on dataset conditions, and that SMOTE provides no significant benefit when the data are already relatively balanced.
Therefore, we recommend improving heart disease classification performance through hyperparameter optimization, selection of relevant features, or the use of more sophisticated algorithms such as Gradient Boosting or Neural Networks.
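The evaluation procedure described above (stratified 5-fold cross-validation with oversampling applied to the training folds only) can be illustrated with a minimal sketch. This is not the authors' code: it uses only the Python standard library, a simplified SMOTE-style interpolation rather than a library implementation, and a plain KNN classifier. The function names (`stratified_folds`, `smote_like`, `knn_predict`, `cross_validate`), the neighbour counts of 5, and the random toy data are all assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict


def stratified_folds(labels, k, rng):
    """Assign every sample index to one of k folds, keeping class ratios similar."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def smote_like(X, y, n_neighbors, rng):
    """Simplified SMOTE-style oversampling (illustrative assumption).

    Each smaller class is grown to the majority-class size by interpolating
    between a random sample and one of its nearest same-class neighbours.
    """
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, count in counts.items():
        members = [x for x, lab in zip(X, y) if lab == label]
        if len(members) < 2:
            continue  # nothing to interpolate with
        for _ in range(target - count):
            base = rng.choice(members)
            neighbours = sorted(
                (m for m in members if m is not base),
                key=lambda m: sq_dist(base, m),
            )[:n_neighbors]
            other = rng.choice(neighbours)
            gap = rng.random()
            X_out.append([b + gap * (o - b) for b, o in zip(base, other)])
            y_out.append(label)
    return X_out, y_out


def knn_predict(X_train, y_train, x, k):
    """Majority vote among the k nearest training samples."""
    nearest = sorted(range(len(X_train)), key=lambda i: sq_dist(X_train[i], x))[:k]
    return Counter(y_train[i] for i in nearest).most_common(1)[0][0]


def cross_validate(X, y, k_folds=5, use_smote=True, seed=0):
    """Mean accuracy over stratified folds; oversampling touches training data only."""
    rng = random.Random(seed)
    accuracies = []
    for test_idx in stratified_folds(y, k_folds, rng):
        held_out = set(test_idx)
        X_train = [X[i] for i in range(len(X)) if i not in held_out]
        y_train = [y[i] for i in range(len(y)) if i not in held_out]
        if use_smote:
            # The held-out fold is never resampled, which avoids leakage.
            X_train, y_train = smote_like(X_train, y_train, 5, rng)
        hits = sum(knn_predict(X_train, y_train, X[i], 5) == y[i] for i in test_idx)
        accuracies.append(hits / len(test_idx))
    return sum(accuracies) / len(accuracies)


if __name__ == "__main__":
    # Hypothetical balanced toy data standing in for the real dataset.
    data_rng = random.Random(42)
    X = [[data_rng.gauss(c, 1.0), data_rng.gauss(c, 1.0)] for c in (0, 1) for _ in range(40)]
    y = ["No"] * 40 + ["Yes"] * 40
    print("without SMOTE:", cross_validate(X, y, use_smote=False))
    print("with SMOTE:   ", cross_validate(X, y, use_smote=True))
```

When the classes are already close to balanced, as in the toy data here, the oversampling step adds few or no synthetic samples, so the two runs see nearly identical training sets. In practice, library implementations such as scikit-learn's `StratifiedKFold` and imbalanced-learn's `SMOTE` would likely replace the hand-written helpers.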
Copyright © 2025