IJCCS (Indonesian Journal of Computing and Cybernetics Systems)
Vol 14, No 1 (2020): January

Oversampling Method To Handling Imbalanced Datasets Problem In Binary Logistic Regression Algorithm

Windyaning Ustyannie (Prodi S2 Ilmu Komputer, FMIPA UGM, Yogyakarta)

Suprapto Suprapto (Departemen Ilmu Komputer dan Elektronika, FMIPA UGM, Yogyakarta)



Article Info

Publish Date
31 Jan 2020

Abstract

Class imbalance is a condition in which one class accounts for a higher percentage of the data than the other, which can affect classification accuracy. Logistic regression is one data mining method that can be used for classification. This research uses the RWO-sampling method with a random replicate approach to generate synthetic data for discrete attributes. The results show that the approach can handle the class imbalance problem: RWO-sampling with the random replicate approach achieves better accuracy than both RWO-sampling with the roulette approach and ROS. Compared with RWO-sampling with the roulette approach, RWO-sampling with the random replicate approach increased accuracy by an average of 15.55% per dataset. Compared with the ROS method, accuracy increased by an average of 3.7% per dataset. Furthermore, in testing the underfitting problem in logistic regression, oversampling performed better than no oversampling, with accuracy increasing by an average of 2.3% per dataset.
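For readers unfamiliar with the ROS baseline named in the abstract, the following is a minimal sketch of random oversampling for a binary dataset. It is not the RWO-sampling method studied in the paper, and the function name `random_oversample`, its parameters, and the toy data are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Balance a two-class dataset by replicating random minority-class rows.

    Illustrative ROS baseline only (an assumption of this sketch),
    not the RWO-sampling procedure described in the abstract.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    if len(counts) != 2:
        raise ValueError("expected exactly two classes")

    # most_common(2) lists the larger class first, then the smaller one.
    (_, n_major), (minority, n_minor) = counts.most_common(2)

    # Collect the rows that belong to the minority class.
    minority_rows = [r for r, l in zip(rows, labels) if l == minority]

    # Draw minority rows with replacement until both classes are equal in size.
    extra = [rng.choice(minority_rows) for _ in range(n_major - n_minor)]

    return list(rows) + extra, list(labels) + [minority] * len(extra)


# Toy data: 8 majority-class rows (label 0) and 2 minority-class rows (label 1).
rows = [[i, i + 1] for i in range(10)]
labels = [0] * 8 + [1] * 2
rows_bal, labels_bal = random_oversample(rows, labels)
```

The balanced output could then be passed to any binary logistic regression fit. Because ROS only duplicates existing rows, it adds no new attribute values, which is the limitation that synthetic-data methods such as RWO-sampling aim to address.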

Copyrights © 2020






Journal Info

Abbrev

ijccs

Subject

Computer Science & IT; Control & Systems Engineering

Description

Indonesian Journal of Computing and Cybernetics Systems (IJCCS), published twice a year, provides a forum for the full range of scholarly study. IJCCS focuses on advanced computational intelligence, including the synergetic integration of neural networks, fuzzy logic and evolutionary computation, so ...