Leaks of sensitive information are a growing concern in cybersecurity and are often caused by insider threats. To address this, a Random Forest classification model was developed to detect user activities that may lead to data leaks. SMOTE-ENN was applied for class balancing, and the model parameters were optimized. The model achieved an average F1-score of 0.9167 in cross-validation and 0.9231 on the test data, indicating a good balance between precision and recall when identifying abnormal activities. It detected abnormal activities with a recall of 94.28%, meaning that it identified most of the risky activities and missed few of them. The AUC-ROC score of 0.9721 shows that the model separates normal from abnormal behavior well across decision thresholds. These results indicate that Random Forest, paired with SMOTE-ENN and parameter optimization, is effective for detecting data leakage risks and insider threats, and that it could be used in information security systems to monitor suspicious activities.
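The relationship between the reported metrics can be illustrated with a minimal Python sketch. It is not the study's code: the function names and the confusion counts are hypothetical and chosen only for illustration. The second function solves the standard F1 definition for precision. Applying it to the reported test F1-score and recall assumes that both figures describe the abnormal class on the same test split, which the abstract does not confirm.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Return precision, recall, and F1 for the positive (abnormal) class.

    tp: abnormal activities correctly flagged
    fp: normal activities wrongly flagged (false alarms)
    fn: abnormal activities that were missed
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def implied_precision(f1: float, recall: float) -> float:
    """Solve F1 = 2PR / (P + R) for P, given F1 and recall R."""
    denom = 2 * recall - f1
    if denom <= 0:
        raise ValueError("no valid precision exists for this F1/recall pair")
    return f1 * recall / denom


# Hypothetical confusion counts, NOT taken from the study.
p, r, f = precision_recall_f1(tp=80, fp=20, fn=10)
print(f"precision={p:.4f} recall={r:.4f} f1={f:.4f}")

# Assumption: the reported test F1 (0.9231) and recall (0.9428) refer to
# the same class and the same split. The abstract does not state this.
print(f"implied precision={implied_precision(0.9231, 0.9428):.4f}")
```

Recall depends only on true positives and false negatives, so it says nothing about false alarms. Those are captured by precision, and therefore by the F1-score.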
Copyrights © 2025