Muhammad Muhammad
Department of Informatics, Universitas Ahmad Dahlan

Published: 1 document


A Comparative Study of K-Means and KNN Imputation for Handling Missing Data in Scholarship Applicant Datasets
Muhammad Muhammad; Tole Sutikno; Imam Riadi
JUITA: Jurnal Informatika, Vol. 13, Issue 3, November 2025
Publisher: Department of Informatics Engineering, Universitas Muhammadiyah Purwokerto

DOI: 10.30595/juita.v13i3.26502

Abstract

Handling missing values is a key issue in data processing, especially in the financial records of prospective scholarship recipients, where precision is vital for effective decision-making. This research analyzes the effectiveness of two commonly used imputation methods, K-Nearest Neighbors (KNN) and K-Means, in filling missing values across key attributes: Semester, Grade Point Average (GPA), number of dependents, number of credits, and parental income. Performance was evaluated using Root Mean Square Error (RMSE) and Mean Absolute Percentage Error (MAPE). The results, observed in scenarios with 15% and 25% missing data, indicate that KNN generally provides more stable and accurate imputations, particularly for attributes with homogeneous distributions such as Semester and GPA, while K-Means performs competitively on attributes with higher variability, provided that the number of clusters is chosen well. However, K-Means tends to be more sensitive to increasing proportions of missing data. These findings underscore the importance of selecting imputation methods that match the distribution of each attribute and the extent of missing data in order to develop reliable predictive models. They can also serve as a reference for developing more accurate scholarship selection processes in the presence of incomplete financial data.
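
The comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy (GPA, credits) rows, the held-out "true" values, the choice of k, and the deterministic centroid initialization are all assumptions made for illustration, and the numbers it produces say nothing about the paper's results. KNN imputation fills a gap from the nearest complete rows, K-Means imputation fills it from the nearest cluster centroid, and RMSE and MAPE score the filled values against the held-out ones.

```python
import math

def rmse(actual, predicted):
    """Root Mean Square Error between true and imputed values."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mape(actual, predicted):
    """Mean Absolute Percentage Error in percent (undefined if a true value is 0)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def _dist(row, other, cols):
    """Euclidean distance restricted to the given column indices."""
    return math.sqrt(sum((row[c] - other[c]) ** 2 for c in cols))

def knn_impute(rows, k=1):
    """Fill each None with the column mean over the k nearest complete rows."""
    complete = [r for r in rows if None not in r]
    filled = []
    for row in rows:
        observed = [c for c, v in enumerate(row) if v is not None]
        neighbours = sorted(complete, key=lambda r: _dist(row, r, observed))[:k]
        filled.append([
            v if v is not None else sum(n[c] for n in neighbours) / len(neighbours)
            for c, v in enumerate(row)
        ])
    return filled

def kmeans_impute(rows, k=2, iterations=10):
    """Cluster the complete rows, then fill each None from the nearest centroid."""
    complete = [r for r in rows if None not in r]
    # Assumption: the first k complete rows seed the centroids, so runs are repeatable.
    centroids = [list(r) for r in complete[:k]]
    all_cols = range(len(complete[0]))
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for r in complete:
            idx = min(range(len(centroids)), key=lambda i: _dist(r, centroids[i], all_cols))
            clusters[idx].append(r)
        # Recompute each centroid as its cluster mean; keep it unchanged if the cluster is empty.
        centroids = [
            [sum(col) / len(members) for col in zip(*members)] if members else centroid
            for members, centroid in zip(clusters, centroids)
        ]
    filled = []
    for row in rows:
        observed = [c for c, v in enumerate(row) if v is not None]
        nearest = min(centroids, key=lambda cen: _dist(row, cen, observed))
        filled.append([v if v is not None else nearest[c] for c, v in enumerate(row)])
    return filled

# Hypothetical toy data: (GPA, credits). Real attributes should be scaled first,
# otherwise the column with the largest range dominates the distance.
rows = [
    [3.0, 20.0],
    [3.2, 22.0],
    [2.0, 10.0],
    [2.2, 12.0],
    [2.9, None],  # credits masked for the experiment
    [2.3, None],  # credits masked for the experiment
]
truth = [19.0, 13.0]  # hypothetical held-out credits for the two masked rows

knn_filled = knn_impute(rows, k=1)  # k=1 keeps the toy example easy to trace
kmeans_filled = kmeans_impute(rows, k=2)
knn_values = [r[1] for r in knn_filled[-2:]]
kmeans_values = [r[1] for r in kmeans_filled[-2:]]

print("KNN     RMSE:", rmse(truth, knn_values), "MAPE:", mape(truth, knn_values))
print("K-Means RMSE:", rmse(truth, kmeans_values), "MAPE:", mape(truth, kmeans_values))
```

In a setup like this, the proportion of masked values (for example 15% or 25% of the rows) and the number of clusters are the knobs to vary when comparing the two methods on a given attribute.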