Bako, Nahum Zhema
Unknown Affiliation

Published: 1 document
Articles


Breaking Class Imbalance Barriers in Intrusion Detection Systems: A Clustering-Based Hybrid Framework
Hambali, Moshood Abiola; Bako, Nahum Zhema; Dalhatu, Mu’awuya; Ishaq, Ashraf
Scientific Journal of Computer Science, Vol. 2 No. 1 (2026): June (Article in Process)
Publisher : PT. Teknologi Futuristik Indonesia

DOI: 10.64539/sjcs.v2i1.2026.378

Abstract

Intrusion Detection Systems (IDS) must contend with increasingly sophisticated cyber threats. Their performance, however, is degraded by class imbalance and high-dimensional feature spaces, which bias classifier training toward majority traffic patterns. This research therefore introduces a hybrid clustering-based IDS approach that combines MiniBatchKMeans clustering with ensemble machine learning strategies to mitigate these challenges. The proposed approach uses the Synthetic Minority Over-sampling Technique (SMOTE) to address class imbalance, the Fast Correlation-Based Filter (FCBF) to reduce feature dimensionality, and the Hyperopt Tree-structured Parzen Estimator (TPE) to optimize the parameters of the clustering step and the classifiers. Four supervised machine learning classifiers (Decision Tree, Random Forest, Extra Trees, and XGBoost) were trained and validated on the NSL-KDD IDS dataset. Experimental analysis indicated high detection accuracy for all classifiers; the best-optimized XGBoost and Random Forest classifiers achieved 99.57% and 99.51% accuracy, respectively. The proposed clustering-optimized approach substantially improved the identification of minority-class attacks, along with sustainability and high generalization capability. These outcomes support the research premise that cluster-aware sampling and ensemble optimization are effective for designing more balanced, accurate, and adaptive IDS that protect against escalating real-world cyber threats.
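
To make the oversampling step named in the abstract more concrete, the following is a minimal sketch of the basic idea behind the Synthetic Minority Over-sampling Technique: each synthetic sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbours. This is an illustration only, not the authors' implementation. The function name `smote_oversample`, the parameter `k`, and the toy data are assumptions introduced here, and the sketch leaves out the clustering, feature selection, and hyperparameter tuning stages that the abstract describes.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority
    samples and their k nearest minority neighbours (basic SMOTE idea).

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy data (assumed, not NSL-KDD): 4 minority rows against 12 majority rows.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
majority_count = 12
new_points = smote_oversample(minority, majority_count - len(minority))
balanced_minority = minority + new_points
```

Because every synthetic point is a convex combination of two real minority samples, it stays inside the region the minority class already occupies. In practice a maintained implementation, such as the `SMOTE` class in the imbalanced-learn package, would normally be preferred over a hand-written version like this one.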