Silkan, Hassan
Unknown Affiliation

A method for missing values imputation of machine learning datasets
Hanyf, Youssef; Silkan, Hassan
IAES International Journal of Artificial Intelligence (IJ-AI) Vol 13, No 1: March 2024
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijai.v13.i1.pp888-898

Abstract

In machine learning applications, missing data must often be handled during the pre-processing of the datasets used to train and test models. Class center missing value imputation (CCMVI) is among the best imputation methods in the literature in terms of prediction accuracy and computing cost. Its main drawback is that it is inadequate for test datasets: it uses class centers to impute incomplete instances, whereas the classes of test instances must be assumed unknown in real-world classification. This work extends the CCMVI method to handle missing values in test datasets. To this end, we propose three techniques: the first combines CCMVI with other imputation methods from the literature, the second imputes incomplete test instances based on their nearest class center, and the third uses the mean of the class centers. A comparison of classification accuracies shows that the second and third techniques achieve accuracy close to that of combining CCMVI with literature imputation methods, namely k-nearest neighbors (KNN) and mean imputation. Moreover, they significantly decrease the time and memory required to impute test datasets.
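
The second and third techniques can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the details, so the sketch assumes that a class center is the per-feature mean of the observed training values in that class, and that the nearest center is found by Euclidean distance over the features observed in the test instance. The function names `class_centers`, `impute_nearest_center`, and `impute_mean_of_centers` are hypothetical.

```python
import numpy as np


def class_centers(X_train, y_train):
    """Return (classes, centers); each center is the per-feature mean of a
    class, ignoring NaNs. Assumed definition, not taken from the paper."""
    classes = np.unique(y_train)
    centers = np.vstack(
        [np.nanmean(X_train[y_train == c], axis=0) for c in classes]
    )
    return classes, centers


def impute_nearest_center(X_test, centers):
    """Fill missing values of each test row from its nearest class center.
    Distance is Euclidean over the row's observed features (an assumption)."""
    X = X_test.astype(float)  # astype returns a copy, so X_test is untouched
    for i in range(X.shape[0]):
        missing = np.isnan(X[i])
        if not missing.any():
            continue
        observed = ~missing
        diffs = centers[:, observed] - X[i, observed]
        distances = np.sqrt((diffs ** 2).sum(axis=1))
        nearest = centers[np.argmin(distances)]
        X[i, missing] = nearest[missing]
    return X


def impute_mean_of_centers(X_test, centers):
    """Fill every missing value with the mean of the class centers
    for that feature."""
    X = X_test.astype(float)
    fill = centers.mean(axis=0)
    rows, cols = np.where(np.isnan(X))
    X[rows, cols] = fill[cols]
    return X


# Toy data, invented purely for illustration.
X_train = np.array([[1.0, 2.0], [3.0, np.nan], [10.0, 20.0], [np.nan, 22.0]])
y_train = np.array([0, 0, 1, 1])
X_test = np.array([[9.0, np.nan], [np.nan, 3.0]])

classes, centers = class_centers(X_train, y_train)
X_nearest = impute_nearest_center(X_test, centers)
X_mean = impute_mean_of_centers(X_test, centers)
```

Neither function needs the test labels or the training instances at imputation time, only the class centers. That is consistent with the reduced time and memory reported above, whereas a KNN-based imputer has to keep the training data available.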