Mohd Fauzi, Abdullah Munzir
Unknown Affiliation

Published: 1 document
Articles

Found 1 document

Comparative Analysis of Robust Imputation Techniques for Enhancing Cervical Cancer Prediction with Missing Data
Mizan, Muhammad Thaqiyuddin; Ernawan, Ferda; Kasim, Shahreen; Erianda, Aldo; Mohd Fauzi, Abdullah Munzir
JOIV : International Journal on Informatics Visualization Vol 9, No 5 (2025)
Publisher : Society of Visual Informatics

DOI: 10.62527/joiv.9.5.4501

Abstract

Handling missing data is a critical challenge in machine learning applications because it can significantly affect the accuracy and reliability of predictive models, and addressing it is essential for building robust, high-performing systems. This study presents a comparative analysis of robust imputation techniques for cervical cancer prediction with incomplete data, with particular attention to the Soft Imputer and its role in handling missing values and improving model performance. It examines the impact of five imputation approaches (KNN imputer, PCA imputer, MICE imputer, XGBoost imputer, and LightGBM imputer) together with feature selection methods. The imputed data are evaluated on several machine learning models, namely Random Forest (RF), Extreme Gradient Boosting (XGB), Decision Tree (DT), Support Vector Classifier (SVC), Logistic Regression (LR), Extra Trees Classifier (ETC), CatBoost Classifier, Stochastic Gradient Descent (SGD), and Gradient Boosting (GB), to improve the classification accuracy of cervical cancer prediction. The evaluation shows that the Soft Imputer achieves balanced and effective handling of missing data, significantly improving the reliability of the models. Among the tested methods, LightGBM and XGBoost deliver strong results, each achieving an average accuracy of 96.91%. MICE has the lowest average accuracy at 95.94%, although it still handles missing data reliably. These findings offer guidance for improving predictive accuracy in future work by integrating advanced imputation strategies for high-dimensional and complex datasets.
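
The kind of comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes scikit-learn, uses a synthetic dataset in place of the cervical cancer data, covers only two of the imputers (KNN, plus scikit-learn's MICE-inspired `IterativeImputer` as a stand-in for the MICE imputer) and two of the classifiers (RF and LR). The function name `compare_imputers`, the 20% missing rate, and all hyperparameters are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (enables IterativeImputer)
from sklearn.impute import IterativeImputer, KNNImputer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline


def compare_imputers(missing_rate=0.2, seed=0):
    """Return mean 5-fold CV accuracy for each (imputer, classifier) pair."""
    rng = np.random.default_rng(seed)

    # Assumption: synthetic data stands in for the real clinical dataset.
    X, y = make_classification(
        n_samples=300, n_features=10, n_informative=5, random_state=seed
    )

    # Knock out entries completely at random to simulate missing values.
    X_missing = X.copy()
    X_missing[rng.random(X.shape) < missing_rate] = np.nan

    # Factories so every pipeline gets a fresh, unfitted estimator.
    imputers = {
        "KNN": lambda: KNNImputer(n_neighbors=5),
        "MICE-style": lambda: IterativeImputer(max_iter=10, random_state=seed),
    }
    classifiers = {
        "RF": lambda: RandomForestClassifier(n_estimators=50, random_state=seed),
        "LR": lambda: LogisticRegression(max_iter=1000),
    }

    results = {}
    for imp_name, make_imputer in imputers.items():
        for clf_name, make_clf in classifiers.items():
            # Imputation sits inside the pipeline, so it is fitted on the
            # training folds only and never sees the held-out fold.
            pipe = make_pipeline(make_imputer(), make_clf())
            scores = cross_val_score(pipe, X_missing, y, cv=5, scoring="accuracy")
            results[(imp_name, clf_name)] = float(scores.mean())
    return results


if __name__ == "__main__":
    for (imp_name, clf_name), acc in compare_imputers().items():
        print(f"{imp_name:>10} + {clf_name}: {acc:.4f}")
```

Averaging each imputer's scores over the classifiers would give a per-imputer figure analogous to the average accuracies quoted in the abstract, although the numbers from this synthetic setup are not comparable to the reported results.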