Heart disease is one of the leading causes of death worldwide, with risk factors that include atherosclerosis, high blood pressure, and smoking. Early diagnosis is essential to reduce mortality and improve patients' quality of life. This study evaluates two machine learning algorithms, Support Vector Machine (SVM) and Decision Tree (DT), for predicting heart disease risk, and applies undersampling to handle class imbalance. Both models were evaluated with 10-fold cross-validation (K=10) and tuned through hyperparameter search to obtain their best performance. Without undersampling, SVM achieved 92% accuracy and DT achieved 91%; with undersampling, accuracy fell to 76% for SVM and 75% for DT. Undersampling improved the models' ability to recognize the minority class, although it reduced overall accuracy. Because SVM scored marginally higher than DT in both settings, these findings suggest that SVM is the more reliable of the two for predicting heart disease on datasets with an imbalanced class distribution.
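The trade-off described above, where undersampling lowers overall accuracy but improves recognition of the minority class, can be illustrated with a minimal sketch. It is not the study's implementation: the toy labels (90 negatives, 10 positives), the helper names `random_undersample`, `accuracy`, and `recall`, and the choice of plain random undersampling are all assumptions made for illustration, since the abstract does not say which undersampling variant was used.

```python
import random
from collections import Counter


def random_undersample(X, y, seed=0):
    """Randomly drop rows so every class keeps as many rows as the rarest class.

    Hypothetical helper: plain random undersampling, assumed for illustration.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(y):
        by_class.setdefault(label, []).append(idx)
    n_min = min(len(idxs) for idxs in by_class.values())
    keep = []
    for idxs in by_class.values():
        keep.extend(rng.sample(idxs, n_min))
    keep.sort()  # preserve the original row order
    return [X[i] for i in keep], [y[i] for i in keep]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of true positives that the predictions recover."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical imbalanced toy data: 90 healthy (0) and 10 diseased (1) records.
y = [0] * 90 + [1] * 10
X = [[i] for i in range(len(y))]

# A degenerate model that always predicts the majority class looks accurate
# but never detects a diseased patient.
always_negative = [0] * len(y)
baseline_accuracy = accuracy(y, always_negative)
baseline_recall = recall(y, always_negative)

# After undersampling, both classes are equally represented, so a model can
# no longer score well by ignoring the minority class.
X_bal, y_bal = random_undersample(X, y)
balanced_counts = Counter(y_bal)

print(baseline_accuracy, baseline_recall, dict(balanced_counts))
```

On the toy labels, the always-negative baseline reaches high accuracy with zero recall, which is why accuracy alone can overstate a model's usefulness on imbalanced data. When undersampling is combined with K-fold cross-validation, it is generally applied to the training folds only, so that each validation fold keeps the original class distribution.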
Copyrights © 2025