Brute-force attacks remain among the most prevalent and persistent cybersecurity threats to database systems, causing unauthorized access, data leakage, and service disruptions. Conventional thresholdbased detection methods often struggle to adapt to evolving and dynamic attack patterns, necessitating more robust anomaly detection approaches. This study aims to develop, evaluate, and compare two unsupervised machine learning algorithms—Standard Isolation Forest (IF) and Rotated Isolation Forest (RIF)—for detecting brute-force attacks targeting databases such as MariaDB. A large-scale raw access log dataset containing millions of entries was pre-processed through data cleaning, normalization, and feature extraction. Behavioural features were engineered for IP-path pairs, including login-attempt frequency, request intervals, and rapid-attempt ratios. The dataset consisted of 1,831,989 benign and 5,126,052 brute-force entries. The Standard IF model was trained using benign data (n estimators = 175, contamination = 0.1, max samples = ’auto’) and evaluated on mixed data, achieving Recall 99.94%, Precision 99.29%, F1-Score 99.61%, AUC 0.9495, and Accuracy 99.28%, with TP = 5,123,224 and FN = 2,828. The RIF model, using Gaussian Random Projection (n components = 5), yielded slightly lower metrics: Recall 99.44%, F1-Score 99.36%, and Accuracy 98.81%. The findings indicate that Standard Isolation Forest provides higher detection accuracy and reliability in identifying brute-force anomalies within large-scale log data. Despite the theoretical advantage of feature rotation in handling complex anomalies, the Standard IF demonstrates superior practical performance and efficiency. Overall, the study confirms the method’s strong potential for integration into automated and real-time cybersecurity monitoring systems.
Copyrights © 2025