Heart disease can be detected early by identifying the risk factors that contribute to its development, which the Framingham Heart Study has investigated. Machine learning models can automate this early detection using data from the study. In this work, the data are first passed through several pre-processing stages to prepare them for modeling. Models are then built with the Random Forest, Logistic Regression, and K-Nearest Neighbor algorithms. The individual models perform reasonably well: Random Forest achieves the highest accuracy, 0.91, and Logistic Regression the lowest, 0.67. Two ensemble learning techniques, the Voting Classifier and the Stacking Classifier, are then applied to improve accuracy. Stacking increases accuracy to 0.92, whereas voting does not outperform the Random Forest model alone. This is because voting is better suited to combining algorithms with balanced performance, whereas in this study the Random Forest and Logistic Regression models differ substantially in performance.
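
The comparison described above can be illustrated with a minimal sketch, assuming scikit-learn is used. The abstract does not give the study's hyperparameters, pre-processing steps, voting mode, or meta-learner, so the values below (100 trees, 5 neighbors, soft voting, a Logistic Regression meta-learner, an 80/20 split) are illustrative assumptions. The function names `build_models` and `evaluate` are hypothetical, and the data are synthetic stand-ins, not the Framingham data, so the printed accuracies will not match the reported figures.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    RandomForestClassifier,
    StackingClassifier,
    VotingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def build_models(seed=0):
    """Return the three base models plus voting and stacking ensembles.

    All hyperparameters here are assumptions made for illustration.
    """
    base = [
        ("rf", RandomForestClassifier(n_estimators=100, random_state=seed)),
        # Scaling is added for the two distance/gradient-sensitive models.
        ("lr", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
        ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))),
    ]
    models = dict(base)
    # Soft voting averages the predicted class probabilities (assumed mode).
    models["voting"] = VotingClassifier(estimators=base, voting="soft")
    # Stacking trains a meta-learner on cross-validated base predictions.
    models["stacking"] = StackingClassifier(
        estimators=base,
        final_estimator=LogisticRegression(max_iter=1000),
        cv=5,
    )
    return models


def evaluate(X, y, seed=0):
    """Fit every model on a stratified 80/20 split and return test accuracy."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed
    )
    scores = {}
    for name, model in build_models(seed).items():
        model.fit(X_train, y_train)
        scores[name] = model.score(X_test, y_test)
    return scores


if __name__ == "__main__":
    # Synthetic placeholder data; replace with the pre-processed study data.
    X, y = make_classification(
        n_samples=600, n_features=15, n_informative=6, random_state=0
    )
    for name, accuracy in evaluate(X, y).items():
        print(f"{name}: {accuracy:.2f}")
```

Because the voting ensemble gives each base model an equal say, a weak member can pull the combined prediction down, while the stacking meta-learner can learn to weight the stronger members more heavily. This is one plausible reading of why stacking, but not voting, improved on Random Forest in the study.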
Copyrights © 2025