ROS, SMOTE, SMOTE-ENN COMPARISON USING GNB and Adaboost Classifiers for Cervical Cancer Imbalanced Dataset
Evvin Faristasari; Sirlus Andreanto Jasman Duli; Indri Dwi Agustin; Yuda Paraswistara; Bradika Almandin Wisesa; Vivin Mahat Putri
Jurnal Teknosains Vol 15, No 2 (2026): June
Publisher : Universitas Gadjah Mada

DOI: 10.22146/teknosains.111431

Abstract

Cervical cancer continues to pose a significant health risk to women, especially when it is diagnosed at a late stage. Early screening therefore plays an important role in slowing disease progression and increasing the likelihood of successful treatment. In recent years, machine learning has been increasingly applied to support disease identification through data classification. This study compares the performance of classification models on a cervical cancer dataset under three resampling techniques used to handle class imbalance: Random Over Sampling (ROS), Synthetic Minority Over-sampling Technique (SMOTE), and SMOTE-ENN. The data came from an open-source dataset and underwent several preprocessing stages, including splitting into training and testing sets, checking for missing values, and imputing incomplete records. The class distribution was then analyzed to confirm the imbalance before resampling was applied. ROS duplicated minority class instances, SMOTE generated synthetic samples through interpolation, and SMOTE-ENN combined oversampling with data cleaning. All experimental scenarios were then evaluated using Gaussian Naive Bayes and AdaBoost classifiers. The findings indicate that Gaussian Naive Bayes combined with ROS achieved better recall than AdaBoost. This suggests that Gaussian Naive Bayes is more sensitive in identifying positive cases, particularly after minority class representation is improved. The results also emphasize that the evaluation of machine learning models, especially in medical applications, should not rely solely on accuracy but should also consider precision and recall to obtain more reliable classification outcomes.
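
The resampling ideas named in the abstract can be illustrated with a minimal, standard-library Python sketch. This is not the authors' code: the toy data, the function names (`random_over_sample`, `smote`, `precision_recall`), and the neighbour count `k=2` are all assumptions made for illustration, and the ENN cleaning step of SMOTE-ENN is left out. In practice, the imbalanced-learn library provides `RandomOverSampler`, `SMOTE`, and `SMOTEENN`, which would normally be fitted on the training split only.

```python
import math
import random
from collections import Counter


def random_over_sample(X, y, minority, rng):
    """ROS: duplicate randomly chosen minority rows until the classes are balanced."""
    counts = Counter(y)
    n_needed = max(counts.values()) - counts[minority]
    minority_rows = [x for x, label in zip(X, y) if label == minority]
    extra = [list(rng.choice(minority_rows)) for _ in range(n_needed)]
    return X + extra, y + [minority] * n_needed


def smote(X, y, minority, rng, k=2):
    """SMOTE-style sketch: interpolate between a minority row and a near minority neighbour."""
    counts = Counter(y)
    n_needed = max(counts.values()) - counts[minority]
    minority_rows = [x for x, label in zip(X, y) if label == minority]
    synthetic = []
    for _ in range(n_needed):
        base = rng.choice(minority_rows)
        # k nearest minority neighbours of the base row (Euclidean distance)
        neighbours = sorted(
            (row for row in minority_rows if row is not base),
            key=lambda row: math.dist(base, row),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return X + synthetic, y + [minority] * n_needed


def precision_recall(y_true, y_pred, positive=1):
    """Precision and recall for the positive class (0.0 when undefined)."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical toy training data: 6 negative rows, 3 positive rows.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.2, 0.4], [0.4, 0.1],
     [2.0, 2.0], [2.2, 2.1], [2.1, 2.3]]
y = [0] * 6 + [1] * 3

rng = random.Random(0)
X_ros, y_ros = random_over_sample(X, y, minority=1, rng=rng)
X_sm, y_sm = smote(X, y, minority=1, rng=rng)
print(Counter(y_ros), Counter(y_sm))

# A model that always predicts "negative" on a 9:1 test set is 90% accurate
# yet finds no positive case, which is why recall is reported alongside accuracy.
y_true = [0] * 9 + [1]
y_pred = [0] * 10
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(accuracy, precision_recall(y_true, y_pred))
```

The rows added by ROS are exact copies of existing minority rows, whereas the SMOTE-style rows are new points on line segments between minority neighbours; that difference is the distinction the abstract draws between duplication and interpolation.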