Jurnal Teknik Informatika (JUTIF)
Vol. 7 No. 1 (2026): JUTIF Volume 7, Number 1, February 2026

Natural Language Processing (NLP) and Support Vector Machine (SVM) Optimization in Detecting Phishing Website URLs

Aritonang, Mhd Adi Setiawan
Simanulang, Maradona Jonas
Batubara, Toras Pangidoan
Zega, Imanuel
Afrizal, M Hafis



Article Info

Publish Date
15 Feb 2026

Abstract

Phishing remains one of the most pervasive cyber-threats, and recent reports indicate a sharp rise in both the volume and the sophistication of attacks. According to the Anti-Phishing Working Group, phishing incidents reached nearly 1 million in Q4 2024. To address this evolving threat, this study develops an automated phishing-URL classification system based on Natural Language Processing (NLP) and a Support Vector Machine (SVM). We used the Kaggle "PhiUSIIL Phishing URL Dataset", comprising 256,795 URL records, and applied comprehensive preprocessing, feature extraction (structural URL features plus NLP-based keyword analysis), and SVM training with grid-search optimisation. The model was evaluated with a confusion matrix and the standard metrics of accuracy, precision, recall, and F1-score. The best model achieved an accuracy of 99.99%, a precision of 99.98%, a recall of 100%, and an F1-score of 99.99%. These results indicate that the combined NLP and SVM approach distinguishes phishing URLs from legitimate ones with very high reliability on this dataset. The proposed system contributes to cybersecurity by offering a feasible AI-based solution for real-time URL screening that can be integrated into browser extensions or enterprise email filters to bolster phishing defences.
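
The feature-extraction and evaluation steps named in the abstract can be illustrated with a minimal sketch that uses only the Python standard library. It is not the authors' implementation: the abstract does not list the actual features or keywords, so the keyword list, the feature names, and the function names `extract_url_features` and `classification_metrics` are hypothetical choices made for illustration. The metric formulas are the standard confusion-matrix definitions.

```python
import re
from urllib.parse import urlparse

# Hypothetical keyword list for illustration only; the abstract does not
# state which keywords the study's NLP-based analysis uses.
SUSPICIOUS_KEYWORDS = ("login", "verify", "secure", "account", "update", "bank")


def extract_url_features(url: str) -> dict:
    """Return a few structural and keyword-based features for one URL."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    lowered = url.lower()
    return {
        # Structural features (assumed examples, not the paper's feature set).
        "url_length": len(url),
        "num_dots": host.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at_symbol": int("@" in url),
        "uses_https": int(parsed.scheme == "https"),
        "host_is_ip": int(bool(re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host))),
        # Keyword feature: how many listed keywords occur anywhere in the URL.
        "keyword_hits": sum(kw in lowered for kw in SUSPICIOUS_KEYWORDS),
    }


def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    print(extract_url_features("http://192.168.0.1/secure-login/verify?id=123"))
    print(classification_metrics(tp=8, fp=2, fn=0, tn=10))
```

In a full pipeline, feature dictionaries like these would be converted to numeric vectors and passed to an SVM whose hyperparameters are tuned by grid search. One common way to do that is scikit-learn's `SVC` combined with `GridSearchCV`, although the abstract does not say which library the study used.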

Copyrights © 2026






Journal Info

Abbrev

jurnal

Publisher

Subject

Computer Science & IT

Description

Jurnal Teknik Informatika (JUTIF) is an Indonesian national journal that publishes high-quality research papers in the broad field of Informatics, Information Systems and Computer Science, which encompasses software engineering, information system development, computer systems, computer networks, ...