JOIV : International Journal on Informatics Visualization
Vol 7, No 1 (2023)

Hybrid Approach with Distance Feature for Multi-Class Imbalanced Datasets

Hartono, Hartono
Ongko, Erianto



Article Info

Publish Date
13 Feb 2023

Abstract

The multi-class imbalance problem is more complex than its binary counterpart. The difficulty stems from the larger number of classes, which increases the likelihood of overlap between classes. Many approaches have been proposed to deal with multi-class imbalance. One is a hybrid approach that combines a data-level approach with an algorithm-level approach, applying an ensemble of classifiers together with oversampling of the minority class. SMOTE is an oversampling method that performs well, but it requires determining the best samples to use in the interpolation process that generates new samples. This need arises from the class overlap that always accompanies the multi-class imbalance problem. Because of this overlap, a safe region must be identified in which SMOTE can synthesize samples during oversampling. The safe region is considered the best place to synthesize samples because overlap is less likely there. It can be identified by constructing distance features. The sample with the best distance and the lowest imbalance ratio is selected for the oversampling process with SMOTE. The main contribution of this research is the proposed Hybrid Approach with Distance Feature, which identifies safe samples; its main advantage is that, in addition to handling multi-class imbalance, it also handles overlap better. The results of this study are compared with Multiple Random Balance (MultiRandBal), which performs random oversampling. The results show that the Augmented R-Value, Class Average Accuracy, Class Balance Accuracy, and Hamming Loss obtained with the proposed method were better than those obtained with random oversampling. These results also show that the Hybrid Approach with Distance Feature handles multi-class imbalance better than MultiRandBal.
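The idea of restricting SMOTE-style interpolation to a safe region can be illustrated with a minimal sketch. The abstract does not specify how the distance features are constructed, so everything here is an assumption made for illustration: the hypothetical helper `neighbour_purity` scores each sample by the fraction of its k nearest neighbours (Euclidean distance) that share its label, and the hypothetical `oversample_safe` interpolates only between minority samples whose score reaches a chosen `threshold`. The ensemble step and the imbalance-ratio criterion mentioned in the abstract are not modelled.

```python
import numpy as np


def neighbour_purity(X, y, k=5):
    """Assumed stand-in for a distance feature: for each sample, the
    fraction of its k nearest neighbours that carry the same label."""
    # Pairwise Euclidean distances, shape (n, n).
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbour
    nearest = np.argsort(dist, axis=1)[:, :k]
    return (y[nearest] == y[:, None]).mean(axis=1)


def oversample_safe(X, y, minority, n_new, k=5, threshold=0.5, seed=None):
    """Generate n_new synthetic samples for class `minority` by SMOTE-style
    interpolation between 'safe' minority samples only (illustrative)."""
    rng = np.random.default_rng(seed)
    scores = neighbour_purity(X, y, k)
    safe_idx = np.flatnonzero((y == minority) & (scores >= threshold))
    if len(safe_idx) < 2:
        raise ValueError("need at least two safe minority samples")
    S = X[safe_idx]

    # Nearest neighbours among the safe minority samples themselves.
    dist = np.linalg.norm(S[:, None, :] - S[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    kk = min(k, len(S) - 1)
    nearest = np.argsort(dist, axis=1)[:, :kk]

    # Pick a random base sample and one of its neighbours, then interpolate:
    # new = base + gap * (neighbour - base), with gap drawn from [0, 1).
    base = rng.integers(0, len(S), size=n_new)
    neigh = nearest[base, rng.integers(0, kk, size=n_new)]
    gap = rng.random((n_new, 1))
    return S[base] + gap * (S[neigh] - S[base])
```

Because each synthetic point is a convex combination of two safe samples, a minority sample that sits inside another class's cluster never contributes to the new data, which is the intuition behind preferring the safe region. The threshold of 0.5 and the choice of k are arbitrary defaults, not values taken from the paper.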

Copyrights © 2023






Journal Info

Abbrev

joiv

Publisher

Subject

Computer Science & IT

Description

JOIV : International Journal on Informatics Visualization is an international peer-reviewed journal dedicated to the exchange of results of high-quality research in all aspects of Computer Science, Computer Engineering, Information Technology and Visualization. The journal publishes state-of-art ...