Yudha, Muhammad Agung Reza
Unknown Affiliation

Published: 1 document

Comparative Analysis of Random Forest and XGBoost Models for Cervical Cancer Risk Prediction using SHAP-based Explainable AI
Yudha, Muhammad Agung Reza; Rahardi, Majid
Journal of Applied Informatics and Computing Vol. 9 No. 6 (2025): December 2025
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v9i6.10357

Abstract

Cervical cancer remains one of the leading causes of cancer-related deaths among women, particularly in developing countries such as Indonesia. This study aims to develop an accurate and interpretable predictive model for cervical cancer risk using the Random Forest (RF) and Extreme Gradient Boosting (XGBoost) algorithms. The data come from the Cervical Cancer (Risk Factors) dataset in the UCI Machine Learning Repository, which contains 858 patient records and 36 clinical and demographic features. The modeling pipeline comprises missing-value imputation, class balancing with the Synthetic Minority Oversampling Technique (SMOTE), and hyperparameter optimization through randomized search with cross-validation (Randomized Search CV). Experimental results show that both models achieved accuracy above 96% and AUC above 0.95, with the XGBoost (Tuned + SMOTE) model slightly outperforming RF in detecting positive cases. Interpretability analysis using SHapley Additive exPlanations (SHAP) identified the clinical features Schiller Test, Hinselmann Test, and Cytology Result as the most influential in the classification process, consistent with established clinical evidence. The integration of XGBoost, SMOTE, and SHAP therefore provides a predictive framework that is both highly accurate and clinically explainable, supporting the development of decision-support systems for early cervical cancer detection.
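The class-balancing step mentioned in the abstract can be illustrated with a minimal sketch of the core SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between an existing minority sample and one of its nearest minority neighbours. This is not the authors' code. The function name `smote_oversample`, its parameters, and the toy data are hypothetical, and a study like this one would more likely rely on a library implementation such as the `SMOTE` class in imbalanced-learn.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by interpolating between
    a randomly chosen sample and one of its k nearest minority neighbours.
    Hypothetical helper for illustration only."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        # Place the new point at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: 3 positive (minority) rows versus 8 negative rows.
minority = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.4]]
majority_count = 8
new_rows = smote_oversample(minority, majority_count - len(minority), k=2)
balanced_minority = minority + new_rows
print(len(balanced_minority))
```

Oversampling of this kind is normally applied to the training split only, so that synthetic rows do not leak into the data used for evaluation.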