Omar, Mohd. Nizam
Unknown Affiliation

Improving Imbalanced Data Handling in Intrusion Detection Systems using SMOTE with an Extended Kalman Filter
Guntoro, Guntoro; Omar, Mohd. Nizam; Mohsin, Mohamad Farhan Mohamad
JOIN (Jurnal Online Informatika) Vol 11 No 1 (2026)
Publisher : Department of Informatics, UIN Sunan Gunung Djati Bandung

DOI: 10.15575/join.v11i1.1687

Abstract

Class imbalance is a major obstacle in building intrusion detection systems (IDS). Most network traffic is normal, while certain attack types are very rare. This skewed distribution biases machine learning models toward the majority traffic, so they often miss infrequent but critical attacks such as Remote to Local (R2L) and User to Root (U2R). To address this problem, this study proposes an improved oversampling method, SMOTE-EKF, which combines the Synthetic Minority Oversampling Technique (SMOTE) with the Extended Kalman Filter (EKF). By treating synthetic sample generation as a nonlinear estimation problem, the EKF refines the generated samples, making them more accurate and reducing noise and overly broad class boundaries. The method was evaluated on the NSL-KDD dataset with a Random Forest classifier, using Accuracy, Precision, Recall, F1-score, G-Mean, and AUC-ROC, together with runtime analysis and cross-validation. SMOTE-EKF outperforms the baseline approaches, achieving 99.70% accuracy, 98.33% precision, 98.38% recall, a 98.35% F1-score, a G-Mean of 98.29%, and an AUC-ROC of 0.993. It also improves detection of rare attacks, with F1-scores of 96.76% for R2L and 93.65% for U2R. The SMOTE-EKF model detects all attack classes in a more balanced way without overfitting. The study also suggests that incorporating predictive methods into the oversampling process is a valuable strategy for improving machine learning-based intrusion detection systems.
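
The abstract does not spell out the filter equations, so the following is only a rough sketch of the general idea and not the authors' SMOTE-EKF. It assumes a standard SMOTE interpolation step followed by a per-feature Kalman-style correction that treats the mean of the nearest minority neighbors as the "measurement". With this identity measurement model the update is the plain linear Kalman update; an actual EKF would linearize a nonlinear model through its Jacobian. The function names and the `prior_var` and `meas_var` parameters are hypothetical.

```python
import math
import random


def nearest_neighbors(point, points, k):
    """Return the k points closest to `point` (Euclidean), excluding `point` itself."""
    others = [p for p in points if p is not point]
    return sorted(others, key=lambda p: math.dist(point, p))[:k]


def smote_sample(minority, k, rng):
    """Standard SMOTE step: interpolate between a minority point and one of its neighbors."""
    base = rng.choice(minority)
    neighbor = rng.choice(nearest_neighbors(base, minority, k))
    gap = rng.random()  # position along the segment base -> neighbor
    synthetic = [b + gap * (n - b) for b, n in zip(base, neighbor)]
    return base, synthetic


def kalman_refine(synthetic, neighbors, prior_var=1.0, meas_var=1.0):
    """Assumed refinement: per-feature Kalman update toward the neighborhood mean.

    prior_var is the assumed uncertainty of the synthetic sample, meas_var the
    assumed uncertainty of the neighborhood mean (both hypothetical knobs).
    """
    gain = prior_var / (prior_var + meas_var)  # Kalman gain, between 0 and 1
    refined = []
    for j, x in enumerate(synthetic):
        z = sum(n[j] for n in neighbors) / len(neighbors)  # "measurement"
        refined.append(x + gain * (z - x))  # state update: x + K * (z - x)
    return refined


if __name__ == "__main__":
    rng = random.Random(42)
    # Toy two-feature minority class standing in for rare attack records.
    minority = [[0.1, 0.2], [0.2, 0.1], [0.15, 0.3], [0.3, 0.25], [0.25, 0.15]]
    base, synthetic = smote_sample(minority, k=3, rng=rng)
    refined = kalman_refine(synthetic, nearest_neighbors(base, minority, 3))
    print("SMOTE sample:  ", synthetic)
    print("Refined sample:", refined)
```

A larger `meas_var` lowers the gain and leaves the sample close to the raw SMOTE output, while a smaller one pulls it toward the local minority neighborhood, which is one plausible way a filter could limit noisy or overly spread synthetic points.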