IJISCS (International Journal Of Information System and Computer Science)
Vol 6, No 3 (2022): IJISCS (International Journal of Information System and Computer Science)

THE EFFECTS OF FEATURE SELECTION METHODS ON THE CLASSIFICATIONS OF IMBALANCED DATASETS

Femi Dwi Astuti (Informatic, Universitas Teknologi Digital Indonesia, Daerah Istimewa Yogyakarta)
Indra Yatini Buryadi (Informatic, Universitas Teknologi Digital Indonesia, Daerah Istimewa Yogyakarta)



Article Info

Publish Date
17 Dec 2022

Abstract

Imbalanced data often leads to suboptimal classification. Datasets with a large number of attributes also tend to degrade classification results. One way to improve classification accuracy is to select, during pre-processing, the features to be used in the classification. This research uses the information gain and gain ratio feature selection algorithms in the pre-processing stage and the Naïve Bayes algorithm for the classification. Tests were performed to determine the accuracy, precision, and recall of classification without feature selection, and the accuracy of classification with information gain, gain ratio, and CBFS feature selection. The results were then compared to determine which feature selection algorithm gives the best results when applied to data with imbalanced classes. The classification accuracy on the default of credit card client dataset using the Naïve Bayes algorithm was 64.27%. Information gain feature selection increased the accuracy by 5.27 percentage points (from 64.27% to 69.54%), while gain ratio feature selection increased it by 14.19 percentage points (from 64.27% to 78.46%). In this case, gain ratio is more suitable for data with greatly varied attribute values.
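
The difference between the two selection criteria named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy labels, and the two toy features (`record_id`, `late_payment`) are hypothetical, and the formulas are the standard textbook definitions for discrete attributes. Information gain is the reduction in class entropy after splitting on a feature; gain ratio divides that gain by the entropy of the feature itself, which penalizes features with many distinct values.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def information_gain(feature, labels):
    """Reduction in class entropy after splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def gain_ratio(feature, labels):
    """Information gain divided by the feature's own entropy (split information)."""
    split_info = entropy(feature)
    if split_info == 0:  # a constant feature carries no information
        return 0.0
    return information_gain(feature, labels) / split_info


# Hypothetical toy data with an imbalanced class (6 "no" vs 2 "yes").
labels = ["no", "no", "no", "no", "no", "no", "yes", "yes"]
record_id = [1, 2, 3, 4, 5, 6, 7, 8]                       # unique per row
late_payment = ["n", "n", "n", "n", "n", "y", "y", "y"]    # two distinct values

for name, feature in [("record_id", record_id), ("late_payment", late_payment)]:
    print(name, information_gain(feature, labels), gain_ratio(feature, labels))
```

In this toy example, information gain ranks the uninformative `record_id` column above `late_payment` because every unique value yields a pure split, whereas gain ratio reverses the ranking. This is one plausible reading of why gain ratio might suit attributes with highly varied values; it does not reproduce the paper's experiment.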

Copyrights © 2022






Journal Info

Abbrev

ijiscs

Subject

Computer Science & IT

Description

The International Journal of Information System and Computer Science (IJISCS) is a publication for researchers and developers to share ideas and results in software engineering and technologies. The journal publishes several types of papers, such as research papers reporting original research results, ...