Phishing is a form of fraud in which attackers use fake websites to steal user information such as login credentials and sensitive financial data. This study compares four machine learning algorithms, CatBoost, XGBoost, Random Forest, and Decision Tree, on their ability to classify phishing websites efficiently and accurately. The dataset used is the Web Page Phishing Dataset. The workflow begins with exploration and preprocessing, which includes data cleaning, handling missing values, normalization, and feature selection, followed by testing. The data were split into training and test sets at a ratio of 80:20. The models were implemented in Python on Google Colaboratory. Model performance was measured with five metrics: accuracy, precision, recall, F1-score, and AUC. The experimental results indicate that CatBoost ranked first, with 89.57% accuracy, 85.74% F1-score, 88.73% precision, 88.78% recall, and 89.00% AUC. XGBoost ranked second with very competitive performance, followed by Random Forest, which was relatively stable with an accuracy of 89.41% and an F1-score of 85.35%. The Decision Tree achieved the lowest performance, with an accuracy of 88.69% and an F1-score of 84.10%, which indicates limitations in handling complex data as well as a tendency to overfit. Overall, the boosting-based ensemble algorithms, especially CatBoost and XGBoost, outperform a single tree in detecting phishing websites. These results should benefit the development of next-generation intelligent phishing detection systems based on machine learning. They also provide a starting point for future work focused on hyperparameter optimization, larger datasets, and real-time phishing detection applications. Furthermore, this work offers a comparison of ensemble algorithms applied in the cybersecurity field.
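The five evaluation metrics named above can be illustrated with a short, dependency-free sketch. This is not the study's code: the function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for illustration, with label 1 standing for "phishing" and 0 for "legitimate".

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for binary labels (1 = phishing)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # phishing caught
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # phishing missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # legitimate passed

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is O(n_pos * n_neg), which is
    fine for a small illustration but not for a large dataset.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (assumed, not taken from the Web Page Phishing Dataset).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.55, 0.1, 0.8, 0.3]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
metrics["auc"] = roc_auc(y_true, scores)
print(metrics)
```

Computed this way for a single positive class, the F1-score always lies between precision and recall; figures that are averaged across classes or folds need not satisfy that relationship.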
Copyrights © 2026