Diabetes mellitus is a chronic metabolic disease whose global prevalence continues to rise. Early detection of diabetes risk is crucial for reducing long-term complications and the associated healthcare costs. A major challenge in applying machine learning to medical data, however, is class imbalance, which can bias a model toward the majority class. This study develops a diabetes risk classification model that combines the K-Nearest Neighbors (KNN) algorithm with Adaptive Synthetic Sampling (ADASYN) to address class imbalance. The dataset, obtained from the Kaggle platform, contains 2,000 patient samples with nine predictive features. Preprocessing consisted of missing-value imputation, outlier handling by winsorizing, and feature standardization with StandardScaler. ADASYN was applied to generate adaptive synthetic samples for the minority class, and the KNN model was trained and evaluated using the confusion matrix, precision, recall, F1-score, accuracy, and ROC-AUC. With ADASYN, ROC-AUC rose by 5.48 percentage points (from 91.34% to 96.82%) and overall accuracy by 2.50 percentage points (from 81.50% to 84.00%). The F1-score for the Diabetes class also increased by 0.40%. These results indicate that integrating KNN with ADASYN improves model performance in detecting high-risk diabetes patients and increases sensitivity toward the minority class.
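
The adaptive oversampling step can be illustrated with a minimal sketch. This is not the study's implementation, which the abstract does not show; it is a simplified, standard-library-only version of the ADASYN idea, and the function name `adasyn`, its parameters (`k`, `beta`, `seed`), and the toy data are all assumptions made for illustration. Minority samples that have more majority-class neighbours are treated as harder to learn and receive more synthetic samples, each created by interpolating toward a nearby minority sample.

```python
import math
import random


def adasyn(X, y, minority=1, k=5, beta=1.0, seed=0):
    """Return synthetic minority samples (simplified ADASYN-style sketch).

    beta=1.0 aims for a roughly balanced set; rounding may shift the
    final count by a few samples.
    """
    rng = random.Random(seed)
    min_idx = [i for i, label in enumerate(y) if label == minority]
    n_min = len(min_idx)
    n_maj = len(X) - n_min
    total_new = int((n_maj - n_min) * beta)
    if total_new <= 0 or n_min < 2:
        return []

    # Step 1: for each minority sample, the share of majority-class
    # points among its k nearest neighbours in the whole dataset.
    ratios = []
    for i in min_idx:
        others = [j for j in range(len(X)) if j != i]
        nearest = sorted(others, key=lambda j: math.dist(X[i], X[j]))[:k]
        ratios.append(sum(y[j] != minority for j in nearest) / len(nearest))

    # Step 2: normalise the ratios into weights (uniform if all are zero).
    total = sum(ratios)
    weights = [r / total for r in ratios] if total > 0 else [1 / n_min] * n_min

    # Step 3: generate samples by interpolating between a minority sample
    # and one of its k nearest minority neighbours.
    synthetic = []
    for i, w in zip(min_idx, weights):
        min_others = [j for j in min_idx if j != i]
        neighbours = sorted(min_others, key=lambda j: math.dist(X[i], X[j]))[:k]
        for _ in range(round(w * total_new)):
            j = rng.choice(neighbours)
            lam = rng.random()
            synthetic.append([a + lam * (b - a) for a, b in zip(X[i], X[j])])
    return synthetic


# Hypothetical 2-D toy data: 8 majority points (label 0) near the origin,
# 3 minority points (label 1), one of which sits inside the majority cluster.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [-0.1, 0.1], [0.1, -0.1],
     [0.3, 0.3], [-0.2, -0.1], [0.2, -0.2],
     [0.25, 0.2], [2.0, 2.0], [2.1, 1.9]]
y = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1]

new_points = adasyn(X, y, minority=1, k=2)
print(len(new_points), "synthetic minority samples generated")
```

In a full pipeline of the kind the abstract describes, oversampling of this sort would normally be applied to the training split only, after scaling, so that the test set contains no synthetic samples; whether the study did so is not stated here.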
                        
                        
                        
                        
                            
Copyright © 2025