Haeranisa Bella Krisanti
Universitas Dian Nuswantoro


Comparative Analysis of TinyBERT, SVM, and Char-CNN Models for Phishing URL Detection
Haeranisa Bella Krisanti; Chaerul Umam
Sistemasi: Jurnal Sistem Informasi, Vol 15, No 5 (2026)
Publisher : Program Studi Sistem Informasi Fakultas Teknik dan Ilmu Komputer

DOI: 10.32520/stmsi.v15i5.6345

Abstract

Phishing is one of the most prevalent cybersecurity threats, using malicious URLs to deceive users and steal sensitive information. This study proposes a URL-based phishing detection method built on the lightweight Transformer model TinyBERT and compares its performance with three baseline models: an SVM on character n-grams, a Random Forest on lexical URL features, and a Char-CNN. The dataset consists of 49,750 URLs with multi-class labels (benign, defacement, malware, and phishing), which were binarized into phishing (label 1) and non-phishing (label 0). The data were divided by stratified split into training, validation, and test sets in a 70%–15%–15% ratio. To address class imbalance, TinyBERT was trained with a class-weighted loss. Evaluation used the confusion matrix, accuracy, precision, recall, and F1-score, together with ROC and Precision–Recall curves. TinyBERT achieved the best performance, with an accuracy of 0.9925, a phishing recall of 0.9512, and an F1-score of 0.9387. It also produced the fewest false negatives (22) of all the models compared. These findings indicate that TinyBERT is more effective at limiting the number of phishing URLs misclassified as benign, which makes it better suited to URL-based phishing detection in cybersecurity systems.
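
The data-preparation and evaluation steps named in the abstract (label binarization, a stratified 70/15/15 split, class weights for the loss, and confusion-matrix metrics) can be illustrated with a minimal standard-library sketch. Everything here is an assumption made for illustration: the function names, the seed, the toy label counts, and the "balanced" weighting formula n_samples / (n_classes * class_count) are not taken from the paper, which states only that class weights were used.

```python
import random
from collections import Counter, defaultdict


def binarize(label):
    """Map a four-class label to 1 (phishing) or 0 (everything else)."""
    return 1 if label == "phishing" else 0


def stratified_split(labels, ratios=(0.70, 0.15, 0.15), seed=0):
    """Return (train, val, test) index lists that preserve class proportions."""
    rng = random.Random(seed)  # seed is an arbitrary, hypothetical choice
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train, val, test = [], [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        n_train = round(ratios[0] * len(idxs))
        n_val = round(ratios[1] * len(idxs))
        train += idxs[:n_train]
        val += idxs[n_train:n_train + n_val]
        test += idxs[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


def balanced_class_weights(labels):
    """Assumed formula: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    return {c: len(labels) / (len(counts) * k) for c, k in counts.items()}


def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels only; these counts are NOT the study's class distribution.
raw = (["benign"] * 120 + ["defacement"] * 40
       + ["malware"] * 20 + ["phishing"] * 20)
y = [binarize(label) for label in raw]

train_idx, val_idx, test_idx = stratified_split(y)
weights = balanced_class_weights([y[i] for i in train_idx])

# Hypothetical confusion-matrix counts, not results from the paper.
metrics = binary_metrics(tp=90, fp=10, fn=10, tn=890)
print(len(train_idx), len(val_idx), len(test_idx), weights, metrics)
```

Under this weighting the minority (phishing) class receives the larger weight, so each missed phishing URL costs more during training; this is one common way to push a model toward fewer false negatives, the quantity the abstract emphasizes.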