Handling missing values is a central problem in data processing, especially for the financial records of prospective scholarship recipients, where accuracy is vital for sound decision making. This study analyzes the effectiveness of two commonly used imputation methods, K-Nearest Neighbors (KNN) and K-Means, in filling missing values across key attributes such as semester, Grade Point Average (GPA), number of dependents, number of credits, and parental income. Performance was evaluated using Root Mean Square Error (RMSE) and Mean Absolute Percentage Error (MAPE). The results indicate that KNN generally provides more stable and accurate imputations, particularly for attributes with homogeneous distributions such as semester and GPA, while K-Means performs competitively on attributes with higher variability, provided that the number of clusters is well chosen. K-Means, however, tends to be more sensitive to increasing proportions of missing data. As observed in the scenarios with 15% and 25% missing data, these findings underscore the importance of selecting an imputation method that matches the distribution of each attribute and the extent of missing data in order to build reliable predictive models. The findings can also serve as a reference for developing more accurate scholarship selection processes when financial data are incomplete.
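The comparison described above can be illustrated with a minimal sketch. It is not the study's implementation: the toy data, the function names (`knn_impute`, `kmeans_impute`, `score`), and the parameter values are all hypothetical, and the features are assumed to be already scaled to comparable ranges. The K-Means variant shown here (cluster the complete rows, then fill each gap from the nearest centroid) is one common formulation, assumed for illustration.

```python
import math
import random

def rmse(true, pred):
    """Root Mean Square Error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(true, pred)) / len(true))

def mape(true, pred):
    """Mean Absolute Percentage Error in percent (assumes no true value is 0)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(true, pred)) / len(true)

def _dist(a, b):
    """Root mean squared difference over the features observed in both rows."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf
    return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

def knn_impute(rows, k=2):
    """Fill each missing cell with the mean of that column over the k nearest rows."""
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is None:
                # Candidate donors: other rows that have column j observed.
                donors = sorted(
                    (_dist(row, other), other[j])
                    for m, other in enumerate(rows)
                    if m != i and other[j] is not None
                )[:k]
                filled[i][j] = sum(v for _, v in donors) / len(donors)
    return filled

def kmeans_impute(rows, n_clusters=2, n_iter=20, seed=0):
    """Cluster the complete rows, then fill gaps from the nearest centroid."""
    complete = [r for r in rows if None not in r]
    rng = random.Random(seed)
    centroids = [list(r) for r in rng.sample(complete, n_clusters)]
    for _ in range(n_iter):
        groups = [[] for _ in centroids]
        for r in complete:
            nearest = min(range(n_clusters), key=lambda c: _dist(r, centroids[c]))
            groups[nearest].append(r)
        # Recompute each centroid; keep the old one if its cluster is empty.
        centroids = [
            [sum(col) / len(g) for col in zip(*g)] if g else centroids[c]
            for c, g in enumerate(groups)
        ]
    filled = []
    for r in rows:
        nearest = min(range(n_clusters), key=lambda c: _dist(r, centroids[c]))
        filled.append(
            [centroids[nearest][j] if v is None else v for j, v in enumerate(r)]
        )
    return filled

# Invented, pre-scaled toy data: two well-separated groups, two features.
truth = [
    [1.0, 1.0], [1.2, 0.9], [0.9, 1.1], [1.1, 1.0],
    [5.0, 5.0], [5.2, 4.9], [4.9, 5.1], [5.1, 5.0],
]
masked = [(3, 1), (7, 0)]  # cells hidden to simulate missing values
observed = [list(r) for r in truth]
for i, j in masked:
    observed[i][j] = None

def score(filled):
    """RMSE and MAPE computed on the masked cells only."""
    true = [truth[i][j] for i, j in masked]
    pred = [filled[i][j] for i, j in masked]
    return rmse(true, pred), mape(true, pred)

knn_filled = knn_impute(observed, k=2)
kmeans_filled = kmeans_impute(observed, n_clusters=2)
print("KNN     RMSE, MAPE:", score(knn_filled))
print("K-Means RMSE, MAPE:", score(kmeans_filled))
```

Scoring only the masked cells mirrors the usual evaluation protocol for imputation: values are hidden from a complete dataset at a chosen rate (for example 15% or 25%), imputed, and then compared with the held-back originals. With real scholarship data, the attributes would need scaling first, since parental income and GPA differ by orders of magnitude and would otherwise dominate the distance calculation.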
Copyright © 2025