Building of Informatics, Technology and Science
Vol 7 No 4 (2026): March 2026

Evaluation of KNN and Logistic Regression for Diabetes Classification with Standardized Preprocessing: Trade-off Between Performance and Interpretability

Alif Zayyin Kamandani (Universitas Dian Nuswantoro, Semarang)
Egia Rosi Subhiyakto (Universitas Dian Nuswantoro, Semarang)



Article Info

Publish Date
31 Mar 2026

Abstract

Although K-Nearest Neighbors (KNN) and Logistic Regression are widely used for diabetes classification, few studies systematically combine a standardized preprocessing pipeline (median imputation, feature standardization, and stratified data splitting) with an evaluation of the trade-off between predictive performance and model interpretability. This study compares the two algorithms on the Pima Indians Diabetes dataset, which consists of 768 samples with eight numerical attributes. The research stages are data exploration, missing-value handling by median imputation, feature standardization with StandardScaler, and a stratified 80:20 data split. Models are evaluated using accuracy, precision, recall, F1-score, the confusion matrix, and ROC-AUC. KNN with the optimal parameter K=21 achieves an accuracy of 75.97%, an F1-score of 61.86%, and a ROC-AUC of 0.8120, while Logistic Regression achieves an accuracy of 70.78%, an F1-score of 54.55%, and a ROC-AUC of 0.8130. KNN thus leads on accuracy and F1-score, while the two models are nearly tied on ROC-AUC. Logistic Regression, in turn, offers interpretability through its coefficients, which identify Glucose (β=1.1825) and BMI (β=0.6887) as the main predictors of diabetes risk. These findings indicate a trade-off between accuracy and interpretability: KNN is better suited to tasks that prioritize predictive accuracy, while Logistic Regression is more appropriate in clinical contexts that require transparency and model accountability.
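To make the interpretability claim concrete, the reported Logistic Regression coefficients can be read as odds ratios. The following is a minimal sketch, not the authors' code: it assumes the coefficients were fitted on standardized features (the abstract lists StandardScaler as a preprocessing step), so one unit corresponds to one standard deviation. The names `coefficients` and `odds_ratio` are illustrative and do not come from the paper.

```python
import math

# Coefficients as reported in the abstract.
coefficients = {"Glucose": 1.1825, "BMI": 0.6887}


def odds_ratio(beta: float) -> float:
    """Multiplicative change in the odds of the positive class for a
    one-unit increase in the feature, with the other features held fixed."""
    return math.exp(beta)


odds_ratios = {name: odds_ratio(beta) for name, beta in coefficients.items()}

for name, ratio in odds_ratios.items():
    # Assumption: features were standardized, so "one unit" means one
    # standard deviation of that feature.
    print(f"{name}: odds ratio per +1 SD = {ratio:.2f}")
```

Under that assumption, a one-standard-deviation increase in Glucose multiplies the odds of a positive classification by about 3.26, and the same increase in BMI multiplies them by about 1.99. KNN has no comparable per-feature quantity, which is the interpretability gap the abstract describes.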

Copyright © 2026






Journal Info

Abbrev

bits

Publisher

Subject

Computer Science & IT

Description

Building of Informatics, Technology and Science (BITS) is an open-access medium for publishing scientific articles that report research results in information technology and computing. Papers submitted to this journal are checked for plagiarism and peer-reviewed to maintain quality. ...