J-Icon : Jurnal Komputer dan Informatika
Vol 10 No 1 (2022): March 2022

A Data Resampling Approach for Handling the Class Imbalance Problem (Pendekatan Resampling Data Untuk Menangani Masalah Ketidakseimbangan Kelas)

Yosua Alberth Sir
Agus H H Soepranoto



Article Info

Publish Date
18 Mar 2022

Abstract

The class imbalance problem in machine learning arises when the number of instances in the minority class differs significantly from the number in the majority class. An imbalanced class ratio biases the classifier toward the majority class, so it tends to make wrong decisions on, or ignore, minority-class instances. To tackle this problem, we apply a data resampling approach using six popular resampling techniques: (i) random oversampling (ROS), (ii) random undersampling (RUS), (iii) synthetic minority oversampling technique (SMOTE), (iv) adaptive synthetic sampling (ADASYN), (v) SMOTETomek, and (vi) SMOTEENN, to balance the class ratio of 15 datasets. Each balanced dataset is then classified using a random forest classifier, and performance is measured with the geometric mean (G-Mean). To compare the six resampling techniques, the G-Mean values were tested using Friedman's nonparametric statistical test; if the null hypothesis was rejected, the analysis continued with Nemenyi's post hoc test. Based on the mean rank values (lower is better), the techniques rank from best to worst as follows: SMOTEENN (1.700), ADASYN (2.767), RUS (3.333), SMOTETomek (3.867), SMOTE (4.000), and ROS (5.333).
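
The evaluation steps named in the abstract can be illustrated with a minimal, standard-library-only Python sketch. It is not the authors' code. It assumes a binary problem, where G-Mean is the square root of sensitivity times specificity, and it computes the Friedman chi-square statistic from mean ranks without a tie correction. The function names (`g_mean`, `rank_row`, `mean_ranks`, `friedman_statistic`) and the toy scores are hypothetical and exist only for illustration; they are not results from the paper.

```python
import math


def g_mean(y_true, y_pred, positive=1):
    """Binary G-Mean: sqrt(sensitivity * specificity) (assumed definition)."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must appear in y_true")
    sensitivity = tp / (tp + fn)  # recall on the positive (minority) class
    specificity = tn / (tn + fp)  # recall on the negative (majority) class
    return math.sqrt(sensitivity * specificity)


def rank_row(row):
    """Rank one dataset's scores: rank 1 = highest, ties share the average rank."""
    order = sorted(range(len(row)), key=lambda j: -row[j])
    ranks = [0.0] * len(row)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and row[order[j + 1]] == row[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for t in range(i, j + 1):
            ranks[order[t]] = average
        i = j + 1
    return ranks


def mean_ranks(scores):
    """scores[d][t] = G-Mean of technique t on dataset d; returns mean rank per technique."""
    n, k = len(scores), len(scores[0])
    totals = [0.0] * k
    for row in scores:
        for t, r in enumerate(rank_row(row)):
            totals[t] += r
    return [total / n for total in totals]


def friedman_statistic(scores):
    """Friedman chi-square from mean ranks (no tie correction in this sketch)."""
    n, k = len(scores), len(scores[0])
    r = mean_ranks(scores)
    return 12 * n / (k * (k + 1)) * (sum(x * x for x in r) - k * (k + 1) ** 2 / 4)


# Toy predictions: 2 minority (1) and 4 majority (0) instances.
toy_gmean = g_mean([1, 1, 0, 0, 0, 0], [1, 0, 0, 0, 0, 1])

# Hypothetical G-Mean table: 3 datasets (rows) x 3 techniques (columns).
toy_scores = [
    [0.9, 0.8, 0.7],
    [0.6, 0.9, 0.5],
    [0.8, 0.8, 0.4],
]
toy_ranks = mean_ranks(toy_scores)
toy_chi2 = friedman_statistic(toy_scores)
```

In the study's setting the table would have 15 rows (datasets) and 6 columns (resampling techniques), and the mean ranks would always sum to k(k+1)/2 = 21, as the reported values do. The resulting statistic would then be compared against a chi-square distribution with k - 1 degrees of freedom before moving on to a post hoc test such as Nemenyi's.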

Copyright © 2022






Journal Info

Abbrev

jicon

Subject

Computer Science & IT

Description

J-ICON : Jurnal Komputer dan Informatika focuses on the areas of computer sciences, artificial intelligence and expert systems, machine learning, information technology and computation, internet of things, mobile e-business, e-commerce, business intelligence, intelligent decision support systems, ...