Diabetes Disease Prediction on Unbalanced Data Using SMOTE-Tomek Links and Random Forest Algorithm
Sukamto, Titis Fatmah; Prameswary, Cathy Lintang; Royadi, Dedi; Sofia, Detin
G-Tech: Jurnal Teknologi Terapan Vol 9 No 3 (2025): G-Tech, Vol. 9 No. 3 July 2025
Publisher : Universitas Islam Raden Rahmat, Malang

DOI: 10.70609/g-tech.v9i3.7164

Abstract

Diabetes is a chronic condition, characterized by elevated blood sugar, that arises when the body cannot produce or use insulin properly. If not treated early, the disease can lead to serious complications. This research implements a machine learning-based classification model to predict diabetes, following the CRISP-DM (Cross-Industry Standard Process for Data Mining) methodology. The dataset was obtained from the Health Center of Sukatani Village, Rajeg, and contains 2,075 records and 21 columns. The SMOTE-Tomek Links resampling technique was used to address the class imbalance in the data. Five classification algorithms were compared: Naive Bayes, Random Forest, Logistic Regression, Decision Tree, and K-Nearest Neighbor (KNN). In the experiments, Random Forest performed best, with 97% accuracy before resampling and 99.64% after SMOTE-Tomek Links was applied. This best model was deployed in a web-based application built with the Streamlit framework. The combination of the CRISP-DM approach, the Random Forest algorithm, and SMOTE-Tomek Links proved effective in predicting diabetes and can help medical personnel and the community prevent, manage, and monitor the disease.
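
The resampling step named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names (`smote`, `remove_tomek_links`), the toy data, and the parameter values are assumptions made for illustration, and it handles only the two-class case with at least two minority samples. SMOTE creates synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours; Tomek link cleaning then drops pairs of opposite-class samples that are each other's nearest neighbour. In practice a library implementation, such as `SMOTETomek` in imbalanced-learn, would normally be used instead.

```python
import math
import random


def nearest(idx, points, candidates):
    """Return candidate indices (excluding idx) sorted by distance to points[idx]."""
    return sorted((j for j in candidates if j != idx),
                  key=lambda j: math.dist(points[idx], points[j]))


def smote(X, y, minority, k=3, seed=0):
    """Add synthetic minority samples until both classes have equal size.

    Assumes a binary problem and at least two minority samples.
    """
    rng = random.Random(seed)
    min_idx = [i for i, label in enumerate(y) if label == minority]
    n_needed = (len(y) - len(min_idx)) - len(min_idx)  # majority minus minority
    X_new, y_new = list(X), list(y)
    for _ in range(max(n_needed, 0)):
        i = rng.choice(min_idx)                      # random minority sample
        j = rng.choice(nearest(i, X, min_idx)[:k])   # one of its k nearest minority neighbours
        gap = rng.random()                           # interpolation factor in [0, 1)
        X_new.append(tuple(a + gap * (b - a) for a, b in zip(X[i], X[j])))
        y_new.append(minority)
    return X_new, y_new


def remove_tomek_links(X, y):
    """Drop both members of every Tomek link (mutual nearest neighbours of opposite class)."""
    all_idx = range(len(X))
    nn = [nearest(i, X, all_idx)[0] for i in all_idx]
    linked = {i for i in all_idx if nn[nn[i]] == i and y[i] != y[nn[i]]}
    keep = [i for i in all_idx if i not in linked]
    return [X[i] for i in keep], [y[i] for i in keep]


# Hypothetical toy data: six majority (0) and two minority (1) samples.
X = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (0.3, 0.2), (0.2, 0.4), (0.4, 0.1),
     (1.0, 1.0), (1.1, 0.9)]
y = [0, 0, 0, 0, 0, 0, 1, 1]

X_bal, y_bal = smote(X, y, minority=1)               # oversample the minority class
X_clean, y_clean = remove_tomek_links(X_bal, y_bal)  # then clean borderline pairs
```

The cleaned `X_clean` and `y_clean` would then be passed to whichever classifier is being evaluated. Resampling is normally applied to the training split only, so that the reported accuracy is measured on data that contains no synthetic samples.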