International Journal of Advances in Data and Information Systems
Vol. 7 No. 1 (2026): April 2026 - International Journal of Advances in Data and Information Systems

A Heterogeneous Dataset–Driven Ensemble Learning Framework for Malicious URL Detection

Sukarno, Parman (Unknown)
Ngah, Syahrulanuar (Unknown)



Article Info

Publish Date
31 Mar 2026

Abstract

Modern cyberattacks are increasingly associated with phishing campaigns, malware distribution, and website defacement, which are often delivered through malicious Uniform Resource Locators (URLs) originating from diverse sources. This paper examines malicious URL detection using an ensemble learning framework evaluated on a large-scale heterogeneous dataset composed of URLs aggregated from multiple public threat intelligence sources. The dataset includes benign, phishing, malware, and defacement URLs, thereby reflecting real-world variability in attack patterns and data distributions. Three ensemble-based classifiers, namely Decision Tree (DT), Random Forest (RF), and Gradient Boosting (GB), are evaluated with respect to detection accuracy and computational efficiency. In addition to classification performance, this study presents a detailed analysis of training and detection times in order to identify the most suitable model for practical deployment. Experimental results indicate that the DT model achieves a training time of 4.14 seconds with macro and weighted accuracies of 94.11% and 91.71%, respectively, and a per-category detection time of 0.2162 seconds. The RF model attains macro and weighted accuracies of 93.64% and 90.94%, with training and detection times of 9.73 seconds and 0.2420 seconds, respectively. Although the GB model exhibits the longest training time of 45.38 seconds, it achieves the fastest per-category detection time of 0.2151 seconds. Despite its comparatively lower accuracy of 92.48% for macro averaging and 89.42% for weighted averaging, the rapid inference capability of GB makes it a strong candidate for real-time malicious URL detection in heterogeneous cybersecurity environments.
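The abstract reports both macro and weighted accuracies without stating how they are computed. The following is a minimal sketch of one common reading, assuming "macro" means the unweighted mean of per-class accuracy and "weighted" means the mean weighted by the number of samples in each class. The labels, the function names (per_class_accuracy, macro_and_weighted, timed), and the toy data are hypothetical and are not taken from the paper. The small timing helper shows one way training or detection time could be measured.

```python
import time
from collections import Counter

def per_class_accuracy(y_true, y_pred):
    """Return ({class: fraction labeled correctly}, {class: sample count})."""
    support = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    accuracy = {c: correct[c] / support[c] for c in support}
    return accuracy, support

def macro_and_weighted(y_true, y_pred):
    """Assumed definitions: macro = plain mean, weighted = support-weighted mean."""
    accuracy, support = per_class_accuracy(y_true, y_pred)
    macro = sum(accuracy.values()) / len(accuracy)
    total = sum(support.values())
    weighted = sum(accuracy[c] * support[c] for c in accuracy) / total
    return macro, weighted

def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start

# Hypothetical toy labels (not data from the paper): an imbalanced set where
# half of the majority class is misclassified as phishing.
y_true = ["benign"] * 6 + ["phishing"] * 2 + ["malware"] + ["defacement"]
y_pred = ["benign"] * 3 + ["phishing"] * 5 + ["malware"] + ["defacement"]

(macro, weighted), elapsed = timed(macro_and_weighted, y_true, y_pred)
print(f"macro={macro:.3f} weighted={weighted:.3f}")  # macro=0.875 weighted=0.700
```

In this toy case the macro figure exceeds the weighted one because the errors fall on the largest class. Under the assumed definitions, a similar gap in reported results would be consistent with the majority class being harder to classify, though the paper's own definitions may differ.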

Copyrights © 2026






Journal Info

Abbrev

IJADIS

Publisher

Subject

Computer Science & IT Electrical & Electronics Engineering

Description

International Journal of Advances in Data and Information Systems (IJADIS) (e-ISSN: 2721-3056) is a peer-reviewed journal in the field of data science and information systems, published twice a year in April and October. The journal is published for those who wish to share ...