Habash, Hussein Kareem
Unknown Affiliation

Articles


A Comparative Study of Resampling Techniques for Handling Class Imbalance in Binary Classification
Habash, Hussein Kareem
Jurnal Pendidikan Matematika Vol. 2 No. 4 (2025): August
Publisher : Indonesian Journal Publisher

DOI: 10.47134/ppm.v2i4.1990

Abstract

Class imbalance skews most binary classifiers toward the majority class, hiding the very events that matter (e.g., fraud and malignancy). We present a clear, quick-to-replicate comparison of four representative resampling approaches: Random Over-Sampling (ROS), SMOTE, the hybrid SMOTE-ENN cleaner, and the ensemble balancer EasyEnsemble. Each is paired with two widely used learners, Logistic Regression and Random Forest. Experiments run on two public tabular benchmarks that span extreme (0.17% fraud) and moderate (2.3% cancer) skew. A simple two-fold stratified split replaces heavy cross-validation, and each model is evaluated on the two metrics that matter most under imbalance: AUROC and PR-AUC. All experiments finish in under ten minutes on a standard laptop, yet they reproduce the qualitative hierarchy seen in much larger studies: SMOTE-ENN attains the best PR-AUC on both datasets, EasyEnsemble leads on AUROC, and naïve ROS trails in every case. Three visuals make the findings transparent at a glance: (i) an end-to-end pipeline schematic, (ii) a bar chart of class ratios, and (iii) a radar plot of mean PR-AUC scores. All code and figures are provided in a single Jupyter notebook (supplementary ZIP); one command installs the dependencies, and a second reproduces every number and image. This streamlined study offers practitioners an evidence-based starting point while remaining fully reproducible for reviewers and students alike.
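The evaluation loop described in the abstract (stratified two-fold split, resampling of the training fold only, then AUROC and PR-AUC on the untouched test fold) can be sketched as follows. This is a minimal, dependency-free illustration and not the authors' notebook, which is not reproduced here. The data are synthetic, `centroid_scorer` is a hypothetical toy stand-in for Logistic Regression or Random Forest, only ROS is shown, and all function names are assumptions made for this example.

```python
import random
from collections import defaultdict

def stratified_two_fold(labels, seed=0):
    """Split indices into two folds that keep each class's proportion."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, t in enumerate(labels):
        by_class[t].append(i)
    folds = ([], [])
    for idx in by_class.values():
        rng.shuffle(idx)
        half = len(idx) // 2
        folds[0].extend(idx[:half])
        folds[1].extend(idx[half:])
    return folds

def random_over_sample(X, y, seed=0):
    """ROS: duplicate random minority rows until both classes are equal in size."""
    rng = random.Random(seed)
    pos = [i for i, t in enumerate(y) if t == 1]
    neg = [i for i, t in enumerate(y) if t == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    keep = list(range(len(y))) + extra
    return [X[i] for i in keep], [y[i] for i in keep]

def auroc(y_true, scores):
    """Chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(y_true, scores):
    """Step-wise PR-AUC: mean precision at the rank of each positive.
    Assumes untied scores; tied scores are ranked in input order."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, t) in enumerate(ranked, start=1):
        if t == 1:
            hits += 1
            total += hits / rank
    return total / hits

def centroid_scorer(X, y):
    """Toy learner (assumption): project onto the line joining the class centroids."""
    dims = range(len(X[0]))
    def centroid(label):
        rows = [x for x, t in zip(X, y) if t == label]
        return [sum(r[d] for r in rows) / len(rows) for d in dims]
    w = [a - b for a, b in zip(centroid(1), centroid(0))]
    return lambda x: sum(wi * xi for wi, xi in zip(w, x))

# Synthetic data with 2% positives; NOT one of the paper's benchmarks.
rng = random.Random(42)
y = [1] * 20 + [0] * 980
X = [[rng.gauss(2.0 * t, 1.0), rng.gauss(1.0 * t, 1.0)] for t in y]

fold_a, fold_b = stratified_two_fold(y, seed=0)
results = []
for train, test in ((fold_a, fold_b), (fold_b, fold_a)):
    # Resample the training fold only, so the test fold keeps its natural skew.
    X_tr, y_tr = random_over_sample([X[i] for i in train], [y[i] for i in train])
    score = centroid_scorer(X_tr, y_tr)
    y_te = [y[i] for i in test]
    s_te = [score(X[i]) for i in test]
    results.append((auroc(y_te, s_te), average_precision(y_te, s_te)))

for fold, (roc, pr) in enumerate(results, start=1):
    print(f"fold {fold}: AUROC={roc:.3f}  PR-AUC={pr:.3f}")
```

In practice, the resamplers and learners named in the abstract are available in the imbalanced-learn and scikit-learn packages. The key design point the sketch preserves is that resampling is applied inside each training fold, never to the test fold, so the reported metrics reflect the original class ratio.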