Missing data is a critical challenge in machine learning applications because it can significantly degrade the accuracy and reliability of predictive models. This study presents a comparative analysis of robust imputation techniques for cervical cancer prediction with incomplete information, with particular attention to the Soft Imputer. It examines the impact of five imputation approaches (KNN imputer, PCA imputer, MICE imputer, XGBoost imputer, and LightGBM imputer), together with feature selection methods. The imputed data are evaluated with several machine learning models, namely Random Forest (RF), Extreme Gradient Boosting (XGB), Decision Tree (DT), Support Vector Classifier (SVC), Logistic Regression (LR), Extra Trees Classifier (ETC), CatBoost Classifier, Stochastic Gradient Descent (SGD), and Gradient Boosting (GB), to assess the resulting classification accuracy for cervical cancer prediction. The evaluation indicates that the Soft Imputer achieves balanced and effective handling of missing data and significantly improves the reliability of the models. Among the tested methods, LightGBM and XGBoost deliver strong results, each achieving an average accuracy of 96.91%. MICE has the lowest average accuracy at 95.94%, although it still handles missing data reliably. These findings offer guidance for improving predictive accuracy in future work by integrating advanced imputation strategies for high-dimensional and complex datasets.
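To make the idea of a KNN imputer concrete, the following is a minimal illustrative sketch in plain Python. It is not the implementation used in this study: the function names `nan_euclidean` and `knn_impute`, the default `k=2`, and the toy data are assumptions introduced here for illustration. Each missing entry is filled with the mean of that feature over the nearest rows that observe it, where distance is computed only over features present in both rows.

```python
import math


def nan_euclidean(a, b):
    """Euclidean distance over features observed in both rows,
    rescaled to compensate for the features that were skipped."""
    shared = [(x, y) for x, y in zip(a, b)
              if not (math.isnan(x) or math.isnan(y))]
    if not shared:
        return math.inf  # no common features: treat as infinitely far
    squared = sum((x - y) ** 2 for x, y in shared)
    return math.sqrt(squared * len(a) / len(shared))


def knn_impute(rows, k=2):
    """Return a copy of `rows` in which each NaN is replaced by the mean
    of that feature over the k nearest rows that observe it.
    Hypothetical helper for illustration only."""
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if not math.isnan(value):
                continue
            # Candidate donors: other rows where feature j is observed.
            donors = sorted(
                (nan_euclidean(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and not math.isnan(other[j])
            )
            nearest = [v for d, v in donors[:k] if math.isfinite(d)]
            if nearest:  # leave the NaN in place if no donor is usable
                filled[i][j] = sum(nearest) / len(nearest)
    return filled


# Toy data (assumed, not from the study): the first row lacks feature 2.
example = [
    [1.0, 2.0, float("nan")],
    [1.1, 2.1, 3.0],
    [0.9, 1.9, 2.8],
    [8.0, 9.0, 10.0],
]
print(knn_impute(example, k=2)[0])
```

In this toy case the two closest rows supply the values 3.0 and 2.8, so the gap is filled with their mean, while the distant fourth row is ignored. In practice a library implementation, such as `KNNImputer` in scikit-learn, would normally be used in place of a hand-written routine.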
Copyright © 2025