Diabetes is a chronic medical condition characterized by elevated blood glucose levels. Prolonged uncontrolled blood glucose carries a significant risk of severe complications, including renal failure, myocardial infarction, and lower limb amputation. This study compares Support Vector Machine (SVM), Naive Bayes, XGBoost, Random Forest, and Artificial Neural Network (ANN) models for predicting the occurrence of diabetes. The research methodology comprises seven stages: (1) literature review, (2) data collection, (3) exploratory data analysis (EDA), (4) data preprocessing, (5) feature selection, (6) model development, and (7) model evaluation and comparison. The model evaluation results indicate that XGBoost is the most suitable of the five models, achieving a precision of 0.88, a recall of 0.87, an accuracy of 0.8690, an RMSE of 0.3620, and an MSE of 0.1310.
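The reported error figures can be related to the accuracy with a short worked example. The sketch below assumes that MSE and RMSE were computed on hard 0/1 class predictions, which the abstract does not state. Under that assumption each squared error is either 0 or 1, so MSE equals the misclassification rate (1 minus accuracy) and RMSE is its square root. The function name `binary_error_metrics` and the toy labels are hypothetical and are not taken from the study's data.

```python
import math


def binary_error_metrics(y_true, y_pred):
    """Return (accuracy, mse, rmse) for hard 0/1 labels.

    Hypothetical helper for illustration; assumes both inputs are
    equal-length sequences containing only 0 and 1.
    """
    n = len(y_true)
    # Fraction of predictions that match the true label.
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / n
    # With 0/1 labels, (t - p) ** 2 is 1 for a mistake and 0 otherwise.
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
    rmse = math.sqrt(mse)
    return accuracy, mse, rmse


# Toy labels (not the study's data): 2 of 8 predictions are wrong.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
accuracy, mse, rmse = binary_error_metrics(y_true, y_pred)

# Apply the same identity to the accuracy reported in the abstract.
reported_accuracy = 0.8690
implied_mse = 1 - reported_accuracy
implied_rmse = math.sqrt(implied_mse)

print(f"toy: accuracy={accuracy:.4f}, mse={mse:.4f}, rmse={rmse:.4f}")
print(f"implied from reported accuracy: mse={implied_mse:.4f}, rmse={implied_rmse:.4f}")
```

Under this assumption the implied MSE matches the reported 0.1310, and the implied RMSE agrees with the reported 0.3620 to within rounding. If the metrics were instead computed on predicted probabilities, this identity would not hold.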
Copyright © 2025