Jurnal Teknologi dan Sistem Komputer
Volume 9, Issue 4, Year 2021 (October 2021)

Malicious URLs detection using data streaming algorithms

Kayode Sakariyah Adewole (Department of Computer Science, Faculty of Communication and Information Sciences, University of Ilorin, PMB 1515, Ilorin, Kwara State)
Muiz Olalekan Raheem (Department of Computer Science, Faculty of Communication and Information Sciences, University of Ilorin, PMB 1515, Ilorin, Kwara State)
Muyideen Abdulraheem (Department of Computer Science, Faculty of Communication and Information Sciences, University of Ilorin, PMB 1515, Ilorin, Kwara State)
Idowu Dauda Oladipo (Department of Computer Science, Faculty of Communication and Information Sciences, University of Ilorin, PMB 1515, Ilorin, Kwara State)
Abdullateef Oluwagbemiga Balogun (Department of Computer Science, Faculty of Communication and Information Sciences, University of Ilorin, PMB 1515, Ilorin, Kwara State)
Omotola Fatimah Baker (Department of Computer Science, Faculty of Communication and Information Sciences, University of Ilorin, PMB 1515, Ilorin, Kwara State)



Article Info

Publish Date
31 Oct 2021

Abstract

As a result of advancements in technology and technological devices, data is now generated at a practically unbounded rate, emanating from a vast array of networks, devices, and daily operations such as credit card transactions and mobile phone usage. A data stream consists of sequential, continuous, real-time data in the form of an evolving stream. However, the traditional machine learning approach is characterized by a batch learning model, in which labeled training data are given a priori to train a model with some machine learning algorithm. This technique requires the entire training sample to be available before the learning process begins, and training is mainly done offline because of its high cost. Consequently, the traditional batch learning technique suffers severe drawbacks, such as poor scalability for real-time phishing website detection, and the model mostly requires re-training from scratch when new training samples arrive. This paper presents the application of streaming algorithms for detecting malicious URLs based on selected online learners: Hoeffding Tree (HT), Naïve Bayes (NB), and Ozabag. Ozabag produced promising results in terms of accuracy, Kappa, and Kappa Temp on the dataset with large samples, while HT and NB had the lowest prediction time, with accuracy and Kappa comparable to those of Ozabag, for the real-time detection of phishing websites.
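The contrast between batch and online learning described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation or experimental setup: the token-based features, the `OnlineNaiveBayes` class, the `prequential` helper, and the toy URLs are all assumptions made for illustration. The sketch updates a Naïve Bayes model one URL at a time and scores it with test-then-train (prequential) evaluation, reporting accuracy and Cohen's Kappa.

```python
import math
import re
from collections import defaultdict


def tokenize(url):
    """Split a URL into lowercase alphanumeric tokens (hypothetical feature choice)."""
    return [t for t in re.split(r"[^a-z0-9]+", url.lower()) if t]


class OnlineNaiveBayes:
    """Multinomial Naive Bayes whose counts are updated one example at a time."""

    def __init__(self):
        self.class_counts = defaultdict(int)
        self.token_counts = defaultdict(lambda: defaultdict(int))
        self.token_totals = defaultdict(int)
        self.vocab = set()

    def predict(self, tokens):
        if not self.class_counts:
            return None  # nothing has been learned yet
        total = sum(self.class_counts.values())
        v = len(self.vocab) + 1  # +1 leaves room for unseen tokens
        best, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            score = math.log(n / total)
            for t in tokens:
                # Laplace-smoothed token likelihood
                count = self.token_counts[label][t]
                score += math.log((count + 1) / (self.token_totals[label] + v))
            if score > best_score:
                best, best_score = label, score
        return best

    def learn(self, tokens, label):
        self.class_counts[label] += 1
        for t in tokens:
            self.token_counts[label][t] += 1
            self.token_totals[label] += 1
            self.vocab.add(t)


def prequential(stream, model):
    """Test-then-train: predict each example first, then learn from it."""
    y_true, y_pred = [], []
    for url, label in stream:
        tokens = tokenize(url)
        pred = model.predict(tokens)
        if pred is not None:  # skip the very first example (no model yet)
            y_true.append(label)
            y_pred.append(pred)
        model.learn(tokens, label)
    return y_true, y_pred


def accuracy(y_true, y_pred):
    return sum(a == b for a, b in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Agreement between predictions and labels, corrected for chance."""
    n = len(y_true)
    p0 = accuracy(y_true, y_pred)
    labels = set(y_true) | set(y_pred)
    pe = sum((y_true.count(c) / n) * (y_pred.count(c) / n) for c in labels)
    return 1.0 if pe == 1 else (p0 - pe) / (1 - pe)


# Toy stream of (URL, label) pairs; the URLs are made up for illustration.
stream = [
    ("http://example.com/docs/index.html", "benign"),
    ("http://secure-login.example.net/verify/account", "malicious"),
    ("http://example.org/blog/post", "benign"),
    ("http://login-verify.example.net/account/update", "malicious"),
    ("http://example.com/about", "benign"),
    ("http://verify-account.example.net/login", "malicious"),
]

model = OnlineNaiveBayes()
y_true, y_pred = prequential(stream, model)
print("accuracy:", accuracy(y_true, y_pred))
print("kappa:", cohen_kappa(y_true, y_pred))
```

Because the model is never rebuilt from scratch, each new URL costs only a count update, which is the property that makes online learners attractive for real-time detection. A Hoeffding Tree or an online bagging ensemble could be dropped into `prequential` in place of `OnlineNaiveBayes`, provided it exposes the same `predict` and `learn` methods.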

Copyrights © 2021






Journal Info

Abbrev

JTSISKOM

Publisher

Subject

Computer Science & IT; Electrical & Electronics Engineering

Description

Jurnal Teknologi dan Sistem Komputer (JTSiskom, e-ISSN: 2338-0403) is a national online periodical published by the Department of Computer Systems Engineering (Departemen Teknik Sistem Komputer), Universitas Diponegoro, Indonesia. JTSiskom provides a medium for disseminating the results of research, development, and ...