Diabetes is a chronic disease whose global prevalence continues to rise, making it a serious threat to public health. This study compares the performance of three classification algorithms, Logistic Regression, Decision Tree, and Support Vector Machine (SVM), in predicting diabetes risk using a secondary dataset obtained from Kaggle. A quantitative approach was used, with model performance evaluated by classification accuracy. SVM achieved the highest accuracy at 74.46%, followed by Logistic Regression at 73.59% and Decision Tree at 70.56%. SVM handles high-dimensional data and variability well, while Logistic Regression is easier to interpret. Decision Tree is intuitive and easy to visualize but more prone to overfitting. These findings suggest that, of the three algorithms compared, SVM is the most suitable for data-driven diabetes prediction, supporting the development of early detection systems that are fast, efficient, and cost-effective.
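The comparison rests on a single metric, accuracy, which is the share of cases a model classifies correctly. The sketch below illustrates that metric and the resulting ranking under stated assumptions: the function names `accuracy` and `rank_by_accuracy`, the model keys, and the ten toy labels are all hypothetical and invented for illustration. They are not the study's data, code, or results.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def rank_by_accuracy(y_true, predictions):
    """Return (model_name, accuracy) pairs sorted from best to worst."""
    scores = {name: accuracy(y_true, y_pred) for name, y_pred in predictions.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy labels, invented for illustration only (1 = diabetic, 0 = non-diabetic).
y_true = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]

# Hypothetical predictions from three already-trained models.
toy_predictions = {
    "svm":                 [1, 0, 0, 1, 0, 0, 0, 0, 1, 0],
    "logistic_regression": [1, 0, 0, 0, 0, 0, 0, 0, 1, 0],
    "decision_tree":       [1, 1, 0, 0, 0, 0, 0, 0, 1, 0],
}

for name, score in rank_by_accuracy(y_true, toy_predictions):
    print(f"{name}: {score:.2%}")
```

In practice the predictions would come from models fitted on a training split and scored on a held-out test split; accuracy alone does not distinguish missed diabetic cases from false alarms, so it is often read alongside other metrics.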
Copyright © 2025