Journal of Sustainable Software Engineering and Information Systems
Vol. 2 No. 1 (2026): Journal of Sustainable Software Engineering and Information Systems

Comparative Analysis of Machine Learning Algorithms for Diabetes Prediction with Feature and Hyperparameter Optimization

Fikri Fakhar Rahmadhan (Universitas Muhammadiyah Kotabumi)
Fikri Haikal (Universitas Muhammadiyah Kotabumi)
Muhammad Arif (Universitas Muhammadiyah Kotabumi)
Muhammad Agung Insani (Universitas Muhammadiyah Kotabumi)



Article Info

Publish Date
02 May 2026

Abstract

Background: Diabetes is a chronic disease with increasing global prevalence, making early detection essential. Machine learning has shown strong potential for improving prediction accuracy; however, robust validation and systematic optimization are still required. Aims: This study compares machine learning methods for diabetes prediction using a reproducible and methodologically sound framework. Methods: The Pima Indian Diabetes dataset (768 samples, 8 clinical features) was used. Six algorithms were evaluated: Logistic Regression, K-Nearest Neighbors (KNN), Decision Tree, Random Forest, Support Vector Machine, and Gradient Boosting. Hyperparameters were tuned with GridSearchCV, and the models were validated using stratified 5-fold cross-validation. Model performance was assessed using accuracy, precision, recall, F1-score, and AUC-ROC. Results: Ensemble methods outperformed traditional models. Random Forest achieved the highest accuracy, 98% ± 1.8%, with an AUC-ROC of 0.999 ± 0.02, followed by Gradient Boosting at 91% ± 2.1%. Logistic Regression and KNN performed worse, with accuracies of 79% ± 2.3% and 77% ± 2.5%, respectively. Feature importance analysis identified glucose level, BMI, and age as the most influential predictors. Conclusion: The study demonstrates that ensemble methods combined with hyperparameter optimization and robust validation significantly improve diabetes prediction performance and can support clinical decision-making.
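The abstract reports each score as a mean with a spread across stratified 5-fold cross-validation. The following is a minimal sketch of how such folds and summaries might be built, using only the Python standard library. It does not reproduce the study's GridSearchCV tuning or its models. The function names `stratified_kfold` and `summarize`, the seed, and the 500/268 class split are assumptions made for illustration, as is reading the reported spread as a sample standard deviation.

```python
import random
import statistics
from collections import defaultdict


def stratified_kfold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs whose folds preserve class ratios."""
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Shuffle within each class, then deal indices round-robin into k folds
    # so that every fold receives a near-equal share of each class.
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)

    # Each fold serves once as the test set; the rest form the training set.
    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train_idx, test_idx


def summarize(scores):
    """Format per-fold scores as 'mean ± sample standard deviation'."""
    return f"{statistics.mean(scores):.3f} ± {statistics.stdev(scores):.3f}"


# Illustrative labels only: 768 samples with an assumed 500/268 class split.
labels = [0] * 500 + [1] * 268
splits = list(stratified_kfold(labels, k=5, seed=42))
```

In practice, a model would be fitted on each `train_idx`, scored on the matching `test_idx`, and the five resulting scores passed to `summarize` to obtain a figure in the same "mean ± spread" form used in the Results.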

Copyrights © 2026






Journal Info

Abbrev

jsseis

Description

Journal of Sustainable Software Engineering and Information Systems (JSSEIS) focuses on the critical intersection of Software Engineering, Information Systems, and Sustainability and the Environment. The journal aims to advance knowledge and practices in leveraging the principles and methodologies ...