Conventional Intrusion Detection Systems (IDS) often degrade in performance because they cannot cope with the high dimensionality and class imbalance of modern network traffic. This study aims to optimize a network IDS by addressing two limitations of the Random Forest algorithm: its difficulty with high-dimensional data and its lack of model transparency (the black-box problem). The proposed method is a hybrid model that integrates an Autoencoder as a non-linear feature extractor with Random Forest as the classifier. The Autoencoder is trained with a semi-supervised strategy to produce latent features and a reconstruction error, measured as mean squared error (MSE), which serves as an anomaly indicator. The Synthetic Minority Over-sampling Technique (SMOTE) is applied to address class imbalance in the NSL-KDD dataset. To address interpretability, SHAP-based Explainable AI (XAI) is used to explain how the Autoencoder-compressed latent features contribute to the final classification decisions, making the hybrid architecture more transparent. Evaluation results show that the Hybrid Autoencoder-Random Forest model outperforms the Random Forest baseline, with an Accuracy increase of 2.54% (to 77.61%) and a Recall increase of 3.96% (to 62.31%). The improvement in Recall supports the effectiveness of the hybrid features, specifically the reconstruction error, in detecting zero-day attacks, which are characterized by unknown patterns. Furthermore, SHAP visualization reveals the contribution of the latent features, providing transparency that is useful for network security forensic analysis.
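The hybrid feature construction described above can be illustrated with a minimal sketch. It is not the study's implementation: the fixed 4-to-2-to-4 linear maps `ENCODER_W` and `DECODER_W` are hypothetical stand-ins for a trained Autoencoder, and the function names are chosen here for illustration only. The sketch shows only how a latent code and a per-sample reconstruction MSE could be concatenated into one feature vector for a downstream classifier.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]

# Hypothetical stand-in for a trained autoencoder (assumption, not from the study):
# the encoder averages feature pairs, the decoder copies each latent value back.
ENCODER_W: Matrix = [
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.5],
]
DECODER_W: Matrix = [
    [1.0, 0.0],
    [1.0, 0.0],
    [0.0, 1.0],
    [0.0, 1.0],
]


def matvec(w: Matrix, x: Vector) -> Vector:
    """Multiply matrix w by vector x."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def encode(x: Vector) -> Vector:
    """Compress an input record into its latent representation."""
    return matvec(ENCODER_W, x)


def decode(z: Vector) -> Vector:
    """Reconstruct an input record from its latent representation."""
    return matvec(DECODER_W, z)


def reconstruction_mse(x: Vector) -> float:
    """Mean squared error between a record and its reconstruction."""
    x_hat = decode(encode(x))
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def hybrid_features(x: Vector) -> Vector:
    """Latent features followed by the reconstruction error as one extra feature."""
    return encode(x) + [reconstruction_mse(x)]


# A record that fits the structure the toy autoencoder captures reconstructs exactly.
typical = [1.0, 1.0, 2.0, 2.0]
# A record that violates that structure is reconstructed poorly.
unusual = [3.0, 1.0, 0.0, 2.0]

print(hybrid_features(typical))  # [1.0, 2.0, 0.0]
print(hybrid_features(unusual))  # [2.0, 1.0, 1.0]
```

In this toy setting the last component is zero for the record the autoencoder reconstructs well and larger for the one it does not, which is the property that lets the reconstruction error act as an anomaly indicator. In a full pipeline, vectors of this form would be the input to the Random Forest classifier, with SMOTE presumably applied to the training split only.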
Copyright © 2026