Aekwarangkoon, Saifon

Solving missing categorical data in questionnaire responses for automated classification
Aekwarangkoon, Saifon; Namponwatthanakul, Thanatep; Amonwet, Adisorn; Hemtanon, Siranuch
Bulletin of Electrical Engineering and Informatics Vol 14, No 4: August 2025
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/eei.v14i4.8785

Abstract

Handling missing categorical data is critical to the accuracy and reliability of automated classification, particularly in mental health screening based on questionnaire responses. This study investigates several imputation methods, including last observation carried forward (LOCF), k-nearest neighbor (KNN) imputation, hot-deck imputation, and multivariate imputation by chained equations (MICE). KNN imputation achieves the lowest root mean square error (RMSE), indicating the most faithful reconstruction of the original data. In terms of classification performance, however, models trained on MICE-imputed datasets outperformed those trained on data from the other methods and even surpassed models trained on the original incomplete data. We also found that refining imputations from the observed data over multiple iterations can increase the deviation from the original values of the missing entries, yet it can produce datasets with clearer class boundaries and ultimately improve classification accuracy. These findings emphasize the need to balance data fidelity against model performance when selecting an imputation strategy.
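
The RMSE comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes answers are ordinal codes (0 to 3), that LOCF carries the previous item of the same respondent forward, and that KNN imputation takes a majority vote among the k nearest respondents. The toy data and the names `locf_impute`, `knn_impute`, and `rmse` are hypothetical, and hot-deck and MICE are omitted for brevity.

```python
import math
from collections import Counter

def locf_impute(rows):
    """Fill each missing answer (None) with the previous answer in the same row.
    Assumption: 'last observation' means the preceding questionnaire item."""
    filled = []
    for row in rows:
        last, new_row = None, []
        for value in row:
            if value is None:
                value = last          # stays None if nothing was observed yet
            else:
                last = value
            new_row.append(value)
        filled.append(new_row)
    return filled

def knn_impute(rows, k=2):
    """Fill each missing answer with the most common answer among the
    k nearest respondents who answered that item."""
    def distance(a, b):
        # Root mean squared difference over items both respondents answered.
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    filled = [list(row) for row in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            donors = sorted((distance(row, other), other[j])
                            for m, other in enumerate(rows)
                            if m != i and other[j] is not None)
            if not donors:
                continue              # no respondent answered this item
            votes = Counter(answer for _, answer in donors[:k])
            # Majority vote; ties go to the smaller code so the result is deterministic.
            filled[i][j] = min(votes, key=lambda a: (-votes[a], a))
    return filled

def rmse(truth, imputed, masked):
    """RMSE between true and imputed values at the masked positions only."""
    errors = [(truth[i][j] - imputed[i][j]) ** 2 for i, j in masked]
    return math.sqrt(sum(errors) / len(errors))

# Hypothetical toy data: 5 respondents, 4 items, answers coded 0..3.
complete = [
    [0, 0, 1, 0],
    [0, 1, 1, 0],
    [3, 3, 2, 3],
    [3, 2, 3, 3],
    [2, 3, 3, 2],
]
masked = [(0, 2), (2, 1), (4, 3)]     # positions hidden to simulate missingness
observed = [[None if (i, j) in masked else v for j, v in enumerate(row)]
            for i, row in enumerate(complete)]

for name, impute in [("LOCF", locf_impute), ("KNN", knn_impute)]:
    print(name, "RMSE:", round(rmse(complete, impute(observed), masked), 3))
```

A dataset this small says nothing about which method is better in practice. The sketch only shows how imputations are scored against held-back true values, which is the fidelity side of the trade-off the abstract describes. Classification accuracy would need a separate evaluation on the imputed datasets.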