The IJICS (International Journal of Informatics and Computer Science)
Vol 6, No 1 (2022): March 2022

Optimizing Attack Detection for High Dimensionality and Imbalanced Data with SMOTE, Chi-Square and Random Forest Classifier

Kurniabudi Kurniabudi (Universitas Dinamika Bangsa, Jambi)
Abdul Harris (Universitas Dinamika Bangsa, Jambi)
Veronica Veronica (Universitas Dinamika Bangsa, Jambi)
Elvi Yanti (Universitas Dinamika Bangsa, Jambi)



Article Info

Publish Date
31 Mar 2022

Abstract

The rapid growth of networks generates a very large and varied amount of traffic, which has an impact on data and information security. This study addresses two common problems in attack detection, namely the high dimensionality and the high class imbalance of network traffic. This study used the ISCX CICIDS-2017 dataset. The CICIDS-2017 dataset is imbalanced and contains very diverse types of traffic, including normal traffic and several types of attacks (multi-class). This study proposes a combination of the Chi-Square feature selection technique with the tree-based Random Forest classifier. In the experiment, the Chi-Square correlation-based feature selection technique was first applied to the imbalanced dataset. The selected features were then validated using several Random Forest algorithms. The results were also compared with other classification algorithms such as Naïve Bayes, Bayes Network, J48, REPTree, and AdaBoost. This study also examines the use of SMOTE to overcome the problem of high class imbalance. The test results show that the proposed ensemble method performs very well in terms of Accuracy, TPR, FPR, Precision, F-Measure, and ROC values.
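
The two preprocessing steps named in the abstract, Chi-Square feature selection and SMOTE oversampling, can be illustrated with a minimal sketch. This is not the authors' implementation: the toy data, the function names (`chi2_scores`, `select_top_k`, `smote`), and all parameters are hypothetical. The chi-square score is assumed to be computed on non-negative feature sums per class, and SMOTE is reduced to linear interpolation between a minority sample and one of its nearest minority neighbours. The Random Forest step is left out; in practice a library classifier would be trained on the balanced data.

```python
import random
from collections import Counter, defaultdict


def chi2_scores(X, y):
    """Chi-square score per feature (assumes non-negative feature values).

    Observed = sum of the feature over the rows of each class;
    expected = class frequency * total of the feature over all rows.
    """
    n = len(y)
    class_freq = {c: count / n for c, count in Counter(y).items()}
    scores = []
    for j in range(len(X[0])):
        observed = defaultdict(float)
        for row, label in zip(X, y):
            observed[label] += row[j]
        total = sum(observed.values())
        score = 0.0
        for c, freq in class_freq.items():
            expected = freq * total
            if expected > 0:
                score += (observed[c] - expected) ** 2 / expected
        scores.append(score)
    return scores


def select_top_k(X, scores, k):
    """Keep the k highest-scoring feature columns, in original column order."""
    ranked = sorted(range(len(scores)), key=lambda j: -scores[j])
    keep = sorted(ranked[:k])
    return [[row[j] for j in keep] for row in X], keep


def smote(X, y, minority, n_new, k=2, seed=0):
    """Add n_new synthetic minority rows by interpolating between neighbours.

    Requires at least two rows of the minority class.
    """
    rng = random.Random(seed)
    pts = [row for row, label in zip(X, y) if label == minority]
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(pts)
        # Nearest minority neighbours of the base point (squared Euclidean).
        others = [p for p in pts if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return X + synthetic, y + [minority] * n_new


# Hypothetical toy traffic: 3 features, 6 "normal" rows and 2 "attack" rows.
X = [[0, 5, 1], [0, 6, 1], [1, 5, 1], [0, 4, 1],
     [0, 5, 1], [0, 6, 1], [9, 5, 1], [8, 6, 1]]
y = ["normal"] * 6 + ["attack"] * 2

# Step 1: feature selection on the imbalanced data.
X_sel, keep = select_top_k(X, chi2_scores(X, y), k=2)

# Step 2: oversample the minority class until the classes are balanced.
X_bal, y_bal = smote(X_sel, y, minority="attack", n_new=4)

print("kept feature indices:", keep)
print("class counts after SMOTE:", dict(Counter(y_bal)))
```

In this toy data the third feature is constant, so it carries no class information and is the one dropped. The synthetic "attack" rows lie on the segment between the two real attack rows, which is the property that distinguishes SMOTE from plain duplication of minority samples.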

Copyrights © 2022






Journal Info

Abbrev

ijics

Publisher

Subject

Computer Science & IT; Control & Systems Engineering; Electrical & Electronics Engineering

Description

The IJICS (International Journal of Informatics and Computer Science) covers the whole spectrum of intelligent informatics, which includes, but is not limited to: • Artificial Immune Systems, Ant Colonies, and Swarm Intelligence • Autonomous Agents and Multi-Agent Systems • Bayesian ...