Journal of Applied Data Sciences
Vol 7, No 2: May 2026

Optimizing Monkeypox Detection Using Advanced Class Imbalance Handling Methods: SMOTE, SMOTE-ENN, SMOTE-Tomek, Borderline-SMOTE

Rizki, Fahlul
Widowati, Widowati
Widodo, Catur Edi



Article Info

Publish Date
26 Apr 2026

Abstract

Monkeypox is a zoonotic viral disease of growing global concern because of its rapid spread and potential public health impact. Accurate and timely detection is crucial, yet machine learning-based detection systems are often hampered by class imbalance in clinical datasets, which biases predictions toward the majority class. This study systematically evaluates the class imbalance handling techniques SMOTE, Borderline-SMOTE, SMOTE-ENN, and SMOTE-Tomek and their effect on two ensemble learning algorithms, Random Forest and Gradient Boosting, for monkeypox detection. Models were trained and validated with stratified 5-fold cross-validation on a dataset of 25,000 synthetic patient records with 11 clinical features. Accuracy, precision, recall, F1-score, and Area Under the Curve (AUC), together with ROC analysis, were used to assess the impact of each resampling method. Results indicate that hybrid methods, particularly SMOTE-ENN, significantly improve recall and F1-score, strengthening the detection of clinically important monkeypox-positive cases while maintaining adequate discriminative ability. Standard SMOTE and SMOTE-Tomek provide stable performance across metrics, whereas Borderline-SMOTE shows lower recall despite high precision. These findings highlight the importance of choosing a class imbalance handling strategy that matches the clinical objective, here sensitivity in detecting positive monkeypox cases. The study provides practical guidance for implementing reliable and robust machine learning models for early monkeypox detection, contributing to improved clinical decision-making and public health interventions.
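
All four techniques named in the abstract build on the same SMOTE step: a synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority-class neighbours. The following is a minimal, dependency-free sketch of that step only. It is not the study's implementation, and the function name `smote`, its parameters, and the toy data are hypothetical and chosen for illustration. The cleaning stages of SMOTE-ENN and SMOTE-Tomek and the borderline selection of Borderline-SMOTE are not shown.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between a sampled
    minority point and one of its k nearest minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot use more neighbours than exist
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority points by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy data (hypothetical, two features): 8 negatives and 3 positives.
majority = [[float(x), float(y)] for x in range(4) for y in range(2)]
minority = [[10.0, 10.0], [11.0, 10.0], [10.0, 12.0]]

# Oversample the minority class until both classes have the same size.
new_points = smote(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))  # prints: 8 8
```

In practice a maintained library such as imbalanced-learn, which provides SMOTE, Borderline-SMOTE, SMOTE-ENN, and SMOTE-Tomek, would normally be used instead of hand-written code. When resampling is combined with cross-validation, it is usually applied to the training folds only, so that synthetic samples do not leak into the validation data. The abstract does not state how the authors handled this.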

Copyrights © 2026






Journal Info

Abbrev

JADS

Subject

Computer Science & IT; Control & Systems Engineering; Decision Sciences, Operations Research & Management

Description

One of the current hot topics in science is data: how can datasets be used in scientific and scholarly research in a more reliable, citable and accountable way? Data is of paramount importance to scientific progress, yet most research data remains private. Enhancing the transparency of the processes ...