TIN: TERAPAN INFORMATIKA NUSANTARA
Vol 6 No 11 (2026): April 2026

SMOTE-Based Oversampling for Imbalanced Digital Fraud Risk Classification

Fitriana, Ika Nur Laily
Leviany, Fonda
Kasmiarno, Kurnia Sari
Mabruri, Mohammad Okky



Article Info

Publish Date
30 Apr 2026

Abstract

Digital fraud risk among university students is an important issue, yet classifying it from survey-based indicators is complicated by class imbalance. This study examined whether the Synthetic Minority Over-sampling Technique (SMOTE) improves digital fraud risk classification among Universitas Terbuka students. The study used primary survey data from 498 respondents and modeled risk with five predictors: financial literacy, digital financial literacy, monthly gross income, age, and job tenure. The evaluated models were Gaussian Naive Bayes, Random Forest, calibrated linear Support Vector Machine (SVM), Radial Basis Function (RBF) SVM, and XGBoost. Model performance was evaluated using the confusion matrix, accuracy, balanced accuracy, precision, recall, F1 score, ROC-AUC, PR-AUC, Matthews correlation coefficient (MCC), and Kappa. Without oversampling, some models showed higher nominal accuracy but zero recall for the High-risk class, which means that accuracy alone is insufficient for model selection under imbalance. In contrast, SMOTE increased recall for the High-risk class across all models and improved PR-AUC in several cases. The SMOTE-based Random Forest achieved the highest test PR-AUC (0.415), whereas the SMOTE-based RBF SVM achieved the highest recall (0.659). Diagnostic analyses of the selected SMOTE-based Random Forest provided evidence of a non-random predictive signal, although overall discriminative performance remained moderate.
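
The point that accuracy can mislead under class imbalance can be illustrated with a small worked example. The sketch below computes several of the metrics named in the abstract from confusion-matrix counts. The counts are hypothetical and chosen only for illustration; they are not taken from the study, and the function name `binary_metrics` is likewise an assumption of this sketch.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Compute metrics for the positive (High-risk) class from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard each ratio against an empty denominator.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    balanced_accuracy = (recall + specificity) / 2
    # MCC is conventionally reported as 0 when any marginal total is zero.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "recall": recall,
        "precision": precision,
        "balanced_accuracy": balanced_accuracy,
        "mcc": mcc,
    }


# Hypothetical test set (NOT the study's data): 100 cases, 20 of them High risk.
# Model A predicts "Low" for everyone, as an unbalanced model might.
majority_only = binary_metrics(tp=0, fp=0, fn=20, tn=80)

# Model B flags more cases as High risk, as a model trained on rebalanced data might.
rebalanced = binary_metrics(tp=12, fp=24, fn=8, tn=56)

for name, metrics in [("majority_only", majority_only), ("rebalanced", rebalanced)]:
    print(name, {key: round(value, 3) for key, value in metrics.items()})
```

With these illustrative counts, Model A has the higher accuracy (0.80 versus 0.68) yet detects no High-risk cases, while Model B has lower accuracy but non-zero recall, balanced accuracy above 0.5, and a positive MCC. This is the pattern that motivates reporting imbalance-aware metrics alongside accuracy.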

Copyrights © 2026