Pratama, Moch Deny
Department of Informatics, Faculty of Intelligent Electrical and Informatics Technology, Institut Teknologi Sepuluh Nopember

Published: 1 document
Articles

Ensemble Oversampling For Financial Fraud Classification Of Imbalanced Data
Raharjo, Agus Budi; Pratama, Moch Deny; Purwitasari, Diana
IPTEK The Journal for Technology and Science Vol 34, No 3 (2023)
Publisher : IPTEK, DRPM, Institut Teknologi Sepuluh Nopember

DOI: 10.12962/j20882033.v34i3.17183

Abstract

Financial fraud classification tasks, such as credit card fraud and bitcoin fraud detection, involve highly imbalanced data, so oversampling the fraud class is necessary. Financial transactions can have different attributes. In a credit card transaction, the attributes may represent the nominal amount, transaction period information, the status of deposits or other transaction types such as withdrawals or refunds, and more detailed information. In a bitcoin transaction, the attributes may represent the number of nodes, the transaction fee, the output volume, and aggregated figures. Because attribute characteristics vary across financial fraud data, an adaptable oversampling method is needed for the classification model to perform well. An Ensemble Oversampling method is proposed as a general approach to financial fraud classification for both credit cards and bitcoin. The proposed method combines generative and traditional oversampling approaches, namely GAN, SMOTE, and ADASYN. In the classification step, Deep Learning algorithms such as CNN and LSTM are applied to provide better performance, and a genetic algorithm is used to optimize the Deep Learning hyperparameters. The evaluation compared four scenarios, covering the original data without oversampling, oversampling with GAN, SMOTE, or ADASYN, and Ensemble Oversampling. Combined oversampling with GAN and SMOTE, paired with the CNN classifier, produced the highest evaluation scores of all scenarios, with an average F1-Score of 0.995 and a Kappa statistic of 0.990. This shows that the quality of the augmented data affects prediction performance and that the Ensemble Oversampling technique can be considered for improving classifier performance on financial fraud data.
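
As an illustration of the general idea of pooling synthetic minority samples from more than one oversampler, a minimal sketch follows. It is not the authors' implementation. The function names (`smote_like`, `jitter_generator`, `ensemble_oversample`), the equal split of the class deficit between generators, and the toy data are all assumptions made for this example. In particular, `jitter_generator` is only a simple stand-in occupying the slot where a trained generative model such as a GAN would go.

```python
import random


def smote_like(minority, n_new, k=3, rng=None):
    """SMOTE-style interpolation: each new point lies on the segment between
    a minority sample and one of its k nearest minority neighbours."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base` (squared Euclidean distance)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        t = rng.random()
        synthetic.append([a + t * (b - a) for a, b in zip(base, other)])
    return synthetic


def jitter_generator(minority, n_new, scale=0.05, rng=None):
    """Placeholder for a generative model (NOT a GAN): copies a random
    minority sample and adds small Gaussian noise to each attribute."""
    rng = rng or random.Random(1)
    return [
        [a + rng.gauss(0.0, scale) for a in rng.choice(minority)]
        for _ in range(n_new)
    ]


def ensemble_oversample(X, y, generators, minority_label=1, seed=0):
    """Balance a binary dataset by splitting the class deficit as evenly as
    possible across several generators (an assumed combination rule)."""
    minority = [x for x, label in zip(X, y) if label == minority_label]
    deficit = (len(X) - len(minority)) - len(minority)
    if deficit <= 0:
        return list(X), list(y)
    rng = random.Random(seed)
    share, remainder = divmod(deficit, len(generators))
    new_rows = []
    for i, generate in enumerate(generators):
        n_new = share + (1 if i < remainder else 0)
        new_rows.extend(generate(minority, n_new, rng=rng))
    return list(X) + new_rows, list(y) + [minority_label] * len(new_rows)


# Toy data: 20 legitimate transactions (label 0) and 4 fraudulent ones (label 1).
X = [[0.1 * i, 1.0] for i in range(20)] + [
    [5.0, 5.0], [5.2, 4.9], [4.8, 5.1], [5.1, 5.3],
]
y = [0] * 20 + [1] * 4

X_bal, y_bal = ensemble_oversample(X, y, [smote_like, jitter_generator])
```

After the call, `X_bal` holds the original rows followed by the synthetic fraud rows, and both classes have the same number of samples. In practice, oversampling of this kind would be applied to the training split only, so that evaluation metrics such as the F1-Score and Kappa statistic are computed on untouched data.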