Cardiovascular disease (CVD) remains one of the leading causes of death globally, underscoring the need for effective early risk prediction. This systematic literature review analyzes research published between 2013 and 2023 on the application of machine learning (ML) to CVD risk prediction. Key areas examined include feature selection, data preprocessing, algorithm choice, and model evaluation. Studies were selected from the ACM Digital Library, IEEE Xplore, ScienceDirect, and Scopus based on predefined research questions. Common challenges include limited or low-quality datasets, inconsistent preprocessing methods, and the need for clinically interpretable models. Widely used algorithms include random forest (RF), support vector machine (SVM), decision tree (DT), logistic regression (LR), naïve Bayes (NB), k-nearest neighbors (k-NN), and extreme gradient boosting (XGBoost). The review finds that robust preprocessing, careful feature selection, and thorough model validation significantly improve predictive accuracy. It also emphasizes the importance of balancing performance with interpretability for clinical adoption. Finally, the study proposes a structured framework to guide future research and practical implementation, including the integration of genetic and behavioral data to support more personalized and effective cardiovascular care.
Copyright © 2026