This study evaluates three tree-based ensemble algorithms for detecting phishing websites: Random Forest (RF), a bagging method, and the gradient boosting methods XGBoost (XGB) and LightGBM (LGBM). The models were trained on a phishing dataset of HTML, URL, and network features. Two optimization strategies were tested: Hyperband-style hyperparameter search (scikit-learn's HalvingRandomSearchCV, which implements successive halving) and a stacking ensemble combining all three models. Performance was evaluated with five metrics: accuracy, precision, recall, F1-score, and AUC‑ROC. LightGBM tuned via Hyperband achieved the highest performance (accuracy 0.9724; AUC‑ROC 0.9702), followed by the tuned stacking ensemble (accuracy 0.9697; AUC‑ROC 0.9684). The tuned LightGBM's AUC‑ROC exceeds the XGBoost baseline (0.9668) by 0.0034, which supports the effectiveness of Hyperband tuning and stacking ensembles for phishing detection. SHAP analysis was used to interpret the contribution of key features to the phishing predictions.
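As an illustration of how the five reported metrics are defined, the following minimal Python sketch computes them for a binary classifier using only the standard library. The function name `classification_metrics`, the 0.5 decision threshold, and the toy labels and scores are assumptions made for illustration; they are not the study's code or data. The last lines only re-check the AUC‑ROC gap quoted in the abstract.

```python
from typing import Dict, Sequence


def classification_metrics(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Accuracy, precision, recall, F1 and AUC-ROC for binary labels (1 = phishing)."""
    # Turn scores into hard predictions at the (assumed) decision threshold.
    y_pred = [1 if s >= threshold else 0 for s in y_score]

    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )

    # AUC-ROC as the probability that a randomly chosen positive is scored
    # higher than a randomly chosen negative (ties count as one half).
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    auc_roc = wins / (len(pos) * len(neg))

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "auc_roc": auc_roc,
    }


# Hypothetical toy data, not taken from the study.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_score = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]
metrics = classification_metrics(y_true, y_score)
print(metrics)

# AUC-ROC gap between tuned LightGBM and the XGBoost baseline, as reported.
auc_gain = round(0.9702 - 0.9668, 4)
print(auc_gain)
```

Accuracy, precision, recall, and F1 depend on the chosen threshold, whereas AUC‑ROC depends only on how the scores rank positives against negatives, which is why the two kinds of metric can order models differently.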
Copyright © 2025