Al-Dayyeni, Wissam
Unknown Affiliation

Published: 1 document
Articles

Improving Diabetes Prediction Performance Using Random Forest Classifier with Hyperparameter Tuning
Anggreini, Novita Lestari; Yuliana, Ade; Ramdan, Dadan Saepul; Al-Dayyeni, Wissam
Jurnal Teknik Informatika (Jutif) Vol. 6 No. 4 (2025): JUTIF Volume 6, Number 4, August 2025
Publisher: Informatika, Universitas Jenderal Soedirman

DOI: 10.52436/1.jutif.2025.6.4.4755

Abstract

Diabetes mellitus is a chronic metabolic disorder that poses a serious challenge to global healthcare systems due to its increasing prevalence and the high costs associated with treatment. Although machine learning has been widely adopted to support early diagnosis, many predictive models still underperform due to limited preprocessing strategies and inefficient hyperparameter settings. This study proposes a comprehensive machine learning pipeline to enhance diabetes prediction accuracy using a Random Forest classifier optimized through systematic hyperparameter tuning. The novelty of this method lies in its integrated approach, which includes thorough preprocessing: removing duplicate records, handling inconsistent unique values, addressing missing data, and applying the SMOTE technique to overcome class imbalance. Additionally, hyperparameter tuning is conducted using GridSearchCV combined with 5-fold cross-validation, and only the most influential features are selected to improve model interpretability and efficiency. The proposed model achieved 95% accuracy, with a recall of 0.88 and an F1-score of 0.85, indicating that it identifies diabetic cases more robustly than previous studies that used standard machine learning algorithms. This model contributes to the development of a reliable and scalable early detection system for diabetes, applicable in clinical decision support environments. Further refinement can be achieved by testing on larger and more diverse datasets or by implementing more efficient tuning techniques such as Bayesian optimization.
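The tuning stage described in the abstract can be sketched with scikit-learn as follows. This is a minimal illustration, not the authors' code: the dataset is synthetic, the parameter grid is an illustrative guess, and the paper's SMOTE oversampling step (from the imbalanced-learn package) is stood in for by `class_weight="balanced"` to keep the example dependency-free.

```python
# Sketch of Random Forest + GridSearchCV with 5-fold cross-validation,
# as outlined in the abstract. Assumptions: synthetic data stands in for
# the diabetes dataset; class_weight="balanced" stands in for SMOTE.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.metrics import accuracy_score, f1_score, recall_score

# Imbalanced synthetic classification problem (sizes are illustrative).
X, y = make_classification(n_samples=600, n_features=8,
                           weights=[0.7, 0.3], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, test_size=0.2, random_state=42)

# Illustrative hyperparameter grid; the paper does not list its grid here.
param_grid = {"n_estimators": [100, 200], "max_depth": [None, 10]}
search = GridSearchCV(
    RandomForestClassifier(class_weight="balanced", random_state=42),
    param_grid, cv=5, scoring="f1")
search.fit(X_train, y_train)

# Evaluate the best model on held-out data with the abstract's metrics.
y_pred = search.best_estimator_.predict(X_test)
print("best params:", search.best_params_)
print("accuracy:", accuracy_score(y_test, y_pred))
print("recall:", recall_score(y_test, y_pred))
print("f1:", f1_score(y_test, y_pred))
```

In practice the SMOTE step would be inserted before fitting (e.g. via an imbalanced-learn pipeline) so that oversampling happens only on the training folds, never on the held-out test data.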