Kurniawan, Gusti Chandra
Unknown Affiliation

Evaluating Synthetic Minority Oversampling Technique Strategies for Diabetes Mellitus Classification using K-Nearest Neighbors Algorithm
Riadi, Imam; Yudhana, Anton; Kurniawan, Gusti Chandra
Jurnal Teknik Informatika (Jutif) Vol. 6 No. 5 (2025): JUTIF Volume 6, Number 5, Oktober 2025
Publisher : Informatika, Universitas Jenderal Soedirman

DOI: 10.52436/1.jutif.2025.6.5.5189

Abstract

Data-driven classification of Diabetes Mellitus is a crucial strategy for developing medical decision support systems that are both accurate and efficient. A major challenge in this task is imbalanced class distribution, which tends to reduce a model’s sensitivity to positive cases. This research uses a dataset of 1,000 patient medical records from the Mendeley Data repository, containing clinical attributes relevant to diabetes diagnosis, and examines how different values of K affect the K-Nearest Neighbors (KNN) algorithm when it is combined with the Synthetic Minority Oversampling Technique (SMOTE) to enhance classification performance. The experiment employs 10-fold cross-validation with five principal assessment metrics: accuracy, precision, recall, F1-score, and Area Under the Curve (AUC). Compared to prior studies, this work advances the methodology by applying SMOTE within each fold of the cross-validation process, which prevents data leakage, preserves evaluation integrity, and improves model generalizability. Results indicate that K=3 yields the highest F1-score (95.13%) and recall (91.83%), while the highest AUC (96.40%) is achieved at K=9, at the cost of lower sensitivity. The model detects positive cases more effectively while maintaining high precision. These findings highlight that combining KNN with SMOTE and a proper validation strategy is a promising approach for developing a reliable early detection system for Diabetes Mellitus that is adaptive to imbalanced clinical data.
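
The fold-wise oversampling described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the software used, so this version relies only on the Python standard library, and every helper name (nearest, smote, knn_predict, stratified_folds, evaluate) is hypothetical. The data is a synthetic toy set rather than the Mendeley records, label 1 is assumed to be the positive class, and metrics are computed from counts pooled over all folds, which is a simplification.

```python
import random
from collections import Counter

def nearest(points, query, k, skip=None):
    """Indices of the k points closest to `query` (squared Euclidean distance)."""
    dists = [(sum((a - b) ** 2 for a, b in zip(p, query)), i)
             for i, p in enumerate(points) if i != skip]
    return [i for _, i in sorted(dists)[:k]]

def smote(X, y, k=5, rng=random):
    """Balance a binary training set by interpolating between minority neighbours.

    Assumes the minority class has at least two samples.
    """
    counts = Counter(y)
    minority = min(counts, key=counts.get)
    pts = [x for x, label in zip(X, y) if label == minority]
    X_out, y_out = list(X), list(y)
    for _ in range(len(y) - 2 * len(pts)):  # majority count minus minority count
        i = rng.randrange(len(pts))
        j = rng.choice(nearest(pts, pts[i], k, skip=i))
        gap = rng.random()  # position along the segment between the two points
        X_out.append([a + gap * (b - a) for a, b in zip(pts[i], pts[j])])
        y_out.append(minority)
    return X_out, y_out

def knn_predict(X_train, y_train, query, k):
    """Majority vote among the k nearest training points."""
    votes = Counter(y_train[i] for i in nearest(X_train, query, k))
    return votes.most_common(1)[0][0]

def stratified_folds(y, n_folds, rng):
    """Split indices into folds that keep class proportions roughly equal."""
    folds = [[] for _ in range(n_folds)]
    for label in sorted(set(y)):
        idx = [i for i, v in enumerate(y) if v == label]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % n_folds].append(i)
    return folds

def evaluate(X, y, k_neighbors, n_folds=10, seed=0):
    """Cross-validated precision, recall and F1 for the positive class (label 1)."""
    rng = random.Random(seed)
    tp = fp = fn = 0
    for test_idx in stratified_folds(y, n_folds, rng):
        held_out = set(test_idx)
        X_tr = [x for i, x in enumerate(X) if i not in held_out]
        y_tr = [v for i, v in enumerate(y) if i not in held_out]
        # Oversample the training part only; the held-out fold stays untouched,
        # so no synthetic point is derived from a sample used for testing.
        X_tr, y_tr = smote(X_tr, y_tr, rng=rng)
        for i in test_idx:
            pred = knn_predict(X_tr, y_tr, X[i], k_neighbors)
            tp += pred == 1 and y[i] == 1
            fp += pred == 1 and y[i] != 1
            fn += pred != 1 and y[i] == 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}

# Hypothetical toy data: 100 negative and 20 positive points in two dimensions.
data_rng = random.Random(42)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(100)]
X += [[data_rng.gauss(3, 1), data_rng.gauss(3, 1)] for _ in range(20)]
y = [0] * 100 + [1] * 20

for k in (3, 5, 9):
    scores = evaluate(X, y, k_neighbors=k)
    print(k, {name: round(value, 3) for name, value in scores.items()})
```

The key design point is the position of the smote call: it runs inside the fold loop, after the held-out indices have been removed. Oversampling the full dataset before splitting would let synthetic points interpolated from test samples enter the training set, which is the leakage the abstract refers to.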