Phishing is one of the most significant cybersecurity threats of the digital era, with more than 300,000 complaints reported globally in 2023. In Indonesia, the National Cyber and Crypto Agency recorded 47,231,390 incidents of phishing-related anomalous traffic in 2023, making phishing one of the greatest threats to the national digital ecosystem. Because modern phishing attacks are increasingly sophisticated, manual detection is no longer effective, and automatic detection based on machine learning is needed. This study presents a comparative analysis of the Random Forest and XGBoost algorithms for automatic phishing website detection. Although both algorithms have proven effective in the cybersecurity domain, comprehensive comparisons that jointly consider performance, interpretability, and computational efficiency for phishing detection remain limited; this study addresses that gap with the aim of improving national phishing detection systems. The methodology comprises data collection, preprocessing, model implementation, hyperparameter optimization using randomized search with 5-fold stratified cross-validation, and comparative analysis. Experimental results show that optimized XGBoost performs best, with 97.78% accuracy and 73% faster training, while Random Forest reaches 97.65% accuracy and offers interpretability advantages. Feature importance analysis identifies SSL certificate status and anchor URL characteristics as the most discriminative features. The study concludes that optimized XGBoost is the better choice for production deployment of real-time phishing detection systems, while Random Forest is better suited to scenarios that require model transparency.
These findings contribute to the development of national phishing detection systems that support the Indonesian government's digitalization program and protect the public from increasing cybersecurity threats.
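
The tuning and comparison procedure summarized above (randomized search with 5-fold stratified cross-validation, followed by a comparison of accuracy, search time, and feature importance) can be sketched roughly as follows. This is an illustrative sketch and not the study's code: the data are synthetic (`make_classification` stands in for the phishing dataset), the search spaces and `n_iter` are assumptions, and scikit-learn's `GradientBoostingClassifier` is used as a stand-in for XGBoost so that the example depends only on scikit-learn and NumPy. The numbers it prints will not match the reported results.

```python
import time

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV, StratifiedKFold

# Synthetic binary-classification data (assumption: stands in for the phishing dataset).
X, y = make_classification(
    n_samples=400, n_features=30, n_informative=10, random_state=42
)

# 5-fold stratified cross-validation, as described in the methodology.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

# Hypothetical search spaces; the abstract does not list the ones actually used.
candidates = {
    "random_forest": (
        RandomForestClassifier(random_state=42),
        {
            "n_estimators": [50, 100, 150],
            "max_depth": [None, 5, 10],
            "min_samples_leaf": [1, 2, 4],
        },
    ),
    # Stand-in for XGBoost (assumption, see the note above).
    "gradient_boosting": (
        GradientBoostingClassifier(random_state=42),
        {
            "n_estimators": [50, 100, 150],
            "learning_rate": [0.03, 0.1, 0.3],
            "max_depth": [2, 3, 4],
        },
    ),
}

results = {}
for name, (model, space) in candidates.items():
    search = RandomizedSearchCV(
        model,
        param_distributions=space,
        n_iter=5,            # number of sampled configurations (assumption)
        scoring="accuracy",
        cv=cv,
        random_state=42,
    )
    start = time.perf_counter()
    search.fit(X, y)
    elapsed = time.perf_counter() - start

    # Indices of the three most important features of the refitted best model.
    importances = search.best_estimator_.feature_importances_
    top_features = np.argsort(importances)[::-1][:3].tolist()

    results[name] = {
        "best_params": search.best_params_,
        "cv_accuracy": search.best_score_,
        "search_seconds": elapsed,
        "top_features": top_features,
    }

for name, r in results.items():
    print(
        f"{name}: cv_accuracy={r['cv_accuracy']:.4f}, "
        f"search_seconds={r['search_seconds']:.2f}, "
        f"top_features={r['top_features']}"
    )
```

To compare against XGBoost itself, `xgboost.XGBClassifier` follows the scikit-learn estimator interface and could replace the stand-in, with a search space adapted to its own parameters.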
Copyrights © 2025