Journal of Mathematics, Computation and Statistics (JMATHCOS)
Vol. 8 No. 2 (2025): Volume 08 Number 02 (October 2025)

Effect of Feature Normalization and Distance Metrics on K-Nearest Neighbors Performance for Diabetes Disease Classification

Yusran, Muhammad
Sadik, Kusman
Soleh, Agus M
Suhaeni, Cici



Article Info

Publish Date
11 Sep 2025

Abstract

Diabetes is a global health issue whose prevalence increases steadily each year. Early detection of the disease is an important step in preventing severe complications. The K-Nearest Neighbors (KNN) algorithm is often used in disease classification, but its performance is strongly influenced by the choice of normalization method and distance metric. This study evaluates the effect of various normalization methods and distance metrics on the performance of the KNN algorithm in diabetes classification. Three normalization methods were employed: z-score normalization, min-max scaling, and median absolute deviation (MAD). In addition, seven distance metrics were assessed: Euclidean, Manhattan, Chebyshev, Canberra, Hassanat, Lorentzian, and Clark. The dataset used is the Pima Indians Diabetes dataset, which consists of 768 observations and 8 features. The data were split into 80% training data and 20% test data, and 5-fold cross-validation was used to determine the optimal value of k. The results show that the MAD-Canberra combination produces the highest overall accuracy, recall, and F1-score, at 87.32%, 82.33%, and 81.94%, respectively. The highest precision, 86.96%, was obtained from the Baseline-Hassanat combination, while the lowest performance was observed for the Z-Score-Chebyshev combination, with an F1-score of 58.02%. These results show that no single combination universally outperforms the others, underscoring the need for empirical evaluation. Nonetheless, combining MAD normalization with metrics such as Canberra or Hassanat can serve as a strong starting point for developing KNN-based classification systems, especially in medical contexts that are sensitive to misclassification.
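The normalization methods and distance metrics named in the abstract can be illustrated with a minimal pure-Python sketch. The abstract gives no formulas, so everything below is an assumption based on commonly used definitions: the function names are hypothetical, the z-score uses the population standard deviation, MAD scaling is taken as (x - median) / MAD without a consistency constant, Canberra and Clark treat a 0/0 term as 0, and Hassanat follows the usual piecewise form with a shift for negative values. It is not the authors' implementation.

```python
import math
from collections import Counter
from statistics import mean, median, pstdev

# --- Normalization of one feature column (assumes a non-constant column) ---
def z_score(col):
    mu, sigma = mean(col), pstdev(col)  # population std: an assumption
    return [(v - mu) / sigma for v in col]

def min_max(col):
    lo, hi = min(col), max(col)
    return [(v - lo) / (hi - lo) for v in col]

def mad_scale(col):
    med = median(col)
    mad = median([abs(v - med) for v in col])  # median absolute deviation
    return [(v - med) / mad for v in col]

# --- Distance metrics between two equal-length feature vectors ---
def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def manhattan(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))

def chebyshev(x, y):
    return max(abs(a - b) for a, b in zip(x, y))

def canberra(x, y):
    total = 0.0
    for a, b in zip(x, y):
        denom = abs(a) + abs(b)
        if denom:  # a 0/0 term is treated as 0 (assumption)
            total += abs(a - b) / denom
    return total

def clark(x, y):
    total = 0.0
    for a, b in zip(x, y):
        denom = abs(a) + abs(b)
        if denom:  # a 0/0 term is treated as 0 (assumption)
            total += (abs(a - b) / denom) ** 2
    return math.sqrt(total)

def lorentzian(x, y):
    return sum(math.log1p(abs(a - b)) for a, b in zip(x, y))

def hassanat(x, y):
    total = 0.0
    for a, b in zip(x, y):
        lo, hi = min(a, b), max(a, b)
        shift = -lo if lo < 0 else 0.0  # shift negatives so both terms are >= 0
        total += 1 - (1 + lo + shift) / (1 + hi + shift)
    return total

# --- Majority-vote KNN for a single query point ---
def knn_predict(train_X, train_y, query, k, dist):
    ranked = sorted(zip(train_X, train_y), key=lambda p: dist(p[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

Under these assumptions, one normalization-metric combination is tried by applying a normalization to each feature column and then passing the chosen metric as `dist`, for example `knn_predict(train_X, train_y, query, k=5, dist=canberra)`. In a train/test setup like the one described, the normalization statistics would normally be computed on the training split only and then reused for the test split.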

Copyrights © 2025






Journal Info

Abbrev

JMATHCOS

Publisher

Subject

Mathematics

Description

The journal focuses not only on research but also on theoretical knowledge, and it does not publish plagiarized work. Its scope covers mathematical theory, applied mathematics, computational programs, mathematical computation, statistics, and statistics ...