Journal of Electronics, Electromedical Engineering, and Medical Informatics
Vol 8 No 1 (2026): January

A Comparative Analysis of SMOTE and ADASYN for Cervical Cancer Detection using XGBoost with MICE Imputation

Ramadhan, Mita Azzahra
Saragih, Triando Hamonangan
Kartini, Dwi
Muliadi, Muliadi
Mazdadi, Muhammad Itqan

Article Info

Publish Date
24 Jan 2026

Abstract

Cervical cancer remains a significant global health burden for women, with approximately 660,000 new cases and 350,000 associated deaths recorded worldwide in 2022. Machine learning methods have shown great promise in advancing timely detection and accurate diagnosis. This study compares two widely used oversampling strategies, Synthetic Minority Oversampling Technique (SMOTE) and Adaptive Synthetic Sampling (ADASYN), for cervical cancer detection with the XGBoost classifier, using Multiple Imputation by Chained Equations (MICE) to handle missing data. The dataset consists of cervical cancer risk factors with four diagnostic outcomes: Hinselmann, Schiller, Cytology, and Biopsy. These are treated as independent binary classification tasks rather than as a single multilabel classification problem. The risk-factor dataset was first completed with MICE imputation, and SMOTE and ADASYN were then applied to address class imbalance. The XGBoost model was optimized using Random Search hyperparameter tuning and evaluated across five train-test split ratios (50:50, 60:40, 70:30, 80:20, and 90:10) using accuracy, precision (macro, micro, weighted), recall (macro, micro, weighted), F1-score (macro, micro, weighted), and AUC. The XGBoost configuration with MICE and SMOTE performed best, achieving 97.1% accuracy, 97.1% micro-precision, 97.1% micro-recall, 97.1% micro-F1, and 97.1% AUC. The ADASYN-integrated model showed lower results, with 95.4% accuracy, 95.4% micro-precision, 95.4% micro-recall, 95.4% micro-F1, and 55.5% AUC. SMOTE proved more adept at creating evenly distributed synthetic data for the minority class. Overall, this work underscores the value of integrating MICE imputation, SMOTE oversampling, and tuned XGBoost as a reliable approach for cervical cancer detection. These insights pave the way for automated screening tools that can support clinical judgment and improve early diagnosis outcomes.
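
To make the difference between the two oversampling strategies concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a binary task, Euclidean distance, and k = 5 neighbours; the function name `oversample`, its parameters, and the toy data are hypothetical and do not come from the study. SMOTE draws its base points uniformly from the minority class, whereas ADASYN generates more synthetic points around minority samples whose neighbourhoods contain more majority samples.

```python
import numpy as np


def pairwise_dist(A, B):
    """Euclidean distance between every row of A and every row of B."""
    return np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)


def oversample(X, y, minority=1, k=5, adaptive=False, seed=0):
    """Balance a binary dataset with SMOTE-style (adaptive=False) or
    ADASYN-style (adaptive=True) synthetic minority samples."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    X_min = X[y == minority]
    n_min = len(X_min)
    n_new = int((y != minority).sum() - n_min)  # samples needed to balance
    if n_new <= 0:
        return X, y

    if adaptive:
        # ADASYN: weight each minority point by the share of majority samples
        # among its k nearest neighbours in the full dataset (column 0 of the
        # sorted distances is the point itself, so it is skipped).
        order = np.argsort(pairwise_dist(X_min, X), axis=1)[:, 1:k + 1]
        ratio = (y[order] != minority).mean(axis=1)
        if ratio.sum() > 0:
            weights = ratio / ratio.sum()
        else:
            weights = np.full(n_min, 1.0 / n_min)
        counts = np.rint(weights * n_new).astype(int)  # rounding may shift the total slightly
    else:
        # SMOTE: base points are drawn uniformly from the minority class.
        counts = np.bincount(rng.integers(0, n_min, size=n_new), minlength=n_min)

    # Both methods interpolate between a minority point and one of its
    # k nearest minority neighbours.
    d_min = pairwise_dist(X_min, X_min)
    np.fill_diagonal(d_min, np.inf)
    neighbours = np.argsort(d_min, axis=1)[:, :k]
    synthetic = [
        X_min[i] + rng.random() * (X_min[rng.choice(neighbours[i])] - X_min[i])
        for i, c in enumerate(counts)
        for _ in range(c)
    ]
    X_new = np.array(synthetic).reshape(-1, X.shape[1])
    y_new = np.full(len(X_new), minority)
    return np.vstack([X, X_new]), np.concatenate([y, y_new])


# Toy imbalanced data (90 majority, 10 minority); not the study's dataset.
demo_rng = np.random.default_rng(42)
X_demo = np.vstack([demo_rng.normal(0, 1, (90, 2)), demo_rng.normal(2, 1, (10, 2))])
y_demo = np.array([0] * 90 + [1] * 10)

X_smote, y_smote = oversample(X_demo, y_demo, adaptive=False)
X_ada, y_ada = oversample(X_demo, y_demo, adaptive=True)
```

In practice, oversampling of this kind is normally fitted on the training split only, so that synthetic points derived from test samples do not leak into evaluation.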
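
The identical values reported for accuracy, micro-precision, micro-recall, and micro-F1 are expected: when every sample has exactly one label, micro-averaging pools true positives, false positives, and false negatives over all classes, and each of these scores reduces to accuracy. Macro averages, by contrast, weight each class equally and are therefore more sensitive to the minority class. The sketch below illustrates this on hypothetical toy labels; the function name `averaged_scores` and the data are assumptions for illustration only.

```python
import numpy as np


def averaged_scores(y_true, y_pred):
    """Accuracy plus micro- and macro-averaged precision and recall."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    classes = np.unique(np.concatenate([y_true, y_pred]))

    # Per-class counts, treating each class in turn as the positive class.
    tp = np.array([np.sum((y_pred == c) & (y_true == c)) for c in classes])
    fp = np.array([np.sum((y_pred == c) & (y_true != c)) for c in classes])
    fn = np.array([np.sum((y_pred != c) & (y_true == c)) for c in classes])

    # Per-class scores; a class with no predictions or no samples scores 0.
    precision_c = tp / np.maximum(tp + fp, 1)
    recall_c = tp / np.maximum(tp + fn, 1)

    return {
        "accuracy": float(np.mean(y_true == y_pred)),
        # Micro: pool the counts over classes, then compute the ratio.
        "micro_precision": float(tp.sum() / (tp.sum() + fp.sum())),
        "micro_recall": float(tp.sum() / (tp.sum() + fn.sum())),
        # Macro: compute the ratio per class, then take the plain mean.
        "macro_precision": float(precision_c.mean()),
        "macro_recall": float(recall_c.mean()),
    }


# Toy labels: 8 negatives, 2 positives, one positive missed by the classifier.
y_true_demo = [0] * 8 + [1, 1]
y_pred_demo = [0] * 8 + [1, 0]
scores = averaged_scores(y_true_demo, y_pred_demo)
```

Here the micro scores all equal the accuracy, while the macro recall is pulled down by the missed minority sample, which is why macro and per-class figures are usually more informative than micro figures on imbalanced data.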

Copyrights © 2026

Journal Info

Abbrev

jeeemi

Publisher

Subject

Computer Science & IT; Control & Systems Engineering; Electrical & Electronics Engineering; Engineering

Description

The Journal of Electronics, Electromedical Engineering, and Medical Informatics (JEEEMI) is a peer-reviewed open-access journal. The journal invites scientists and engineers throughout the world to exchange and disseminate theoretical and practice-oriented topics that cover three (3) major areas ...