Detecting and preventing cyberattacks is essential to securing networks and data. This study evaluates several machine learning models for detecting and classifying cyberattacks: RandomForest, GradientBoosting, XGBoost, LightGBM, CatBoost, Support Vector Classifier (SVC), Logistic Regression, and an ensemble Voting Classifier. The models were tested on a real-world cybersecurity dataset with significant class imbalance, in which benign traffic vastly outnumbers malicious traffic. Some models, such as RandomForest and the Voting Classifier, achieved high training accuracy but overfit, with test accuracies not exceeding 34%. Boosting models such as XGBoost and LightGBM generalized better than RandomForest but still struggled with the complexity of the dataset. The main limitations of the study are the dataset's imbalance, the high dimensionality of the features, and the models' tendency to overfit. These limitations point to the need for more robust data preprocessing, hyperparameter tuning, and, in future work, more advanced models such as deep learning architectures. The findings clarify the challenges of applying machine learning to cyberattack detection and suggest directions for improving model performance in real-world settings.
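The two diagnostics the abstract relies on, the gap between training and test accuracy and the effect of class imbalance on accuracy, can be illustrated with a minimal sketch. It is not the study's evaluation code: the function names, the toy labels, and the 0.99 training accuracy are hypothetical, and only the 0.34 test accuracy ceiling comes from the reported results.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_recall(y_true, y_pred):
    """Mean of per-class recall, so a rare attack class weighs as much as benign."""
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return sum(hits[c] / support[c] for c in support) / len(support)


def generalization_gap(train_acc, test_acc):
    """Training accuracy minus test accuracy; a large gap suggests overfitting."""
    return train_acc - test_acc


# Hypothetical imbalanced sample: 8 benign flows, 2 attacks.
y_true = ["benign"] * 8 + ["attack"] * 2
# A degenerate classifier that always predicts the majority class.
y_pred = ["benign"] * 10

acc = accuracy(y_true, y_pred)    # looks acceptable despite missing every attack
mr = macro_recall(y_true, y_pred)  # exposes the missed attack class

# 0.99 is an assumed training accuracy; 0.34 is the reported test ceiling.
gap = generalization_gap(0.99, 0.34)

print(f"accuracy={acc:.2f} macro_recall={mr:.2f} gap={gap:.2f}")
```

On imbalanced traffic, a per-class metric such as macro-averaged recall is usually more informative than raw accuracy, because a model can score well on accuracy while missing every attack.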
Copyright © 2024