Simanulang, Maradona Jonas
Unknown Affiliation

Published: 1 document

Natural Language Processing (NLP) and Support Vector Machine (SVM) Optimization in Detecting Phishing Website URLs
Aritonang, Mhd Adi Setiawan; Simanulang, Maradona Jonas; Batubara, Toras Pangidoan; Zega, Imanuel; Afrizal, M Hafis
Jurnal Teknik Informatika (Jutif) Vol. 7 No. 1 (2026): JUTIF Volume 7, Number 1, February 2026
Publisher : Informatika, Universitas Jenderal Soedirman

DOI: 10.52436/1.jutif.2026.7.1.5334

Abstract

Phishing remains one of the most pervasive cyber-threats, with recent reports indicating a sharp rise in both volume and sophistication of attacks. According to the Anti‑Phishing Working Group, phishing incidents reached nearly 1 million in Q4 2024. To address this evolving threat, this study aims to develop an automated phishing-URL classification system based on Natural Language Processing (NLP) and Support Vector Machine (SVM). We utilised the Kaggle "PhiUSIIL Phishing URL Dataset" comprising 256,795 URL records and applied comprehensive preprocessing, feature extraction (structural URL features plus NLP-based keyword analysis), and SVM training with grid search optimisation. Evaluation was performed via confusion matrix and standard metrics of accuracy, precision, recall and F1-score. The best model achieved an accuracy of 99.99%, precision of 99.98%, recall of 100%, and F1-score of 99.99%. These results demonstrate that the combined NLP + SVM approach can effectively distinguish phishing from legitimate URLs with very high reliability. The proposed system contributes to cybersecurity by offering a feasible AI-based solution for real-time URL screening that can be integrated into browser extensions or enterprise email filters to bolster phishing defences.
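
The abstract does not list the exact features or keywords used, so the following is only a minimal sketch of what "structural URL features plus NLP-based keyword analysis" and confusion-matrix metrics might look like. The keyword list, the feature names, and both helper functions (`extract_features`, `classification_metrics`) are hypothetical illustrations, not the authors' implementation.

```python
from urllib.parse import urlsplit

# Hypothetical keyword list; the paper's actual vocabulary is not given in the abstract.
SUSPICIOUS_KEYWORDS = ("login", "verify", "secure", "account", "update", "bank")


def extract_features(url: str) -> dict:
    """Map a URL to a few structural and keyword-based features (illustrative only)."""
    # Add a scheme when missing so that urlsplit can locate the host.
    parts = urlsplit(url if "://" in url else "http://" + url)
    host = parts.hostname or ""
    lowered = url.lower()
    return {
        "url_length": len(url),
        "num_dots": host.count("."),
        "num_hyphens": host.count("-"),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at_symbol": int("@" in url),
        "uses_https": int(parts.scheme == "https"),
        # Rough IPv4 check: four dot-separated groups made only of digits.
        "host_is_ip": int(host.count(".") == 3 and host.replace(".", "").isdigit()),
        # Number of distinct suspicious keywords appearing anywhere in the URL.
        "keyword_hits": sum(kw in lowered for kw in SUSPICIOUS_KEYWORDS),
    }


def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy, precision, recall and F1-score from confusion-matrix counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

Feature dictionaries of this kind could then be vectorised and passed to an SVM whose hyperparameters are tuned by grid search, for example with scikit-learn's `SVC` and `GridSearchCV`. That library choice is an assumption here, since the abstract names the method but not the tooling.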