Garuda - Garba Rujukan Digital

P-Index (2021 - 2026): 0.23

This author has published in the following journal:

International Journal of Advances in Data and Information Systems

Ngah, Syahrulanuar

Unknown Affiliation

Author-ID : 9807741

Computer Science & IT; Electrical & Electronics Engineering

Published : 1 Document

Articles

A Heterogeneous Dataset–Driven Ensemble Learning Framework for Malicious URL Detection
Sukarno, Parman; Ngah, Syahrulanuar
International Journal of Advances in Data and Information Systems, Vol. 7 No. 1 (2026): April 2026
Publisher : Indonesian Scientific Journal

DOI: 10.59395/ijadis.v7i1.1541

Modern cyberattacks are increasingly associated with phishing campaigns, malware distribution, and website defacement, which are often delivered through malicious Uniform Resource Locators (URLs) originating from diverse sources. This paper examines malicious URL detection using an ensemble learning framework evaluated on a large-scale heterogeneous dataset composed of URLs aggregated from multiple public threat intelligence sources. The dataset includes benign, phishing, malware, and defacement URLs, thereby reflecting real-world variability in attack patterns and data distribution. Three ensemble-based classifiers, namely Decision Tree (DT), Random Forest (RF), and Gradient Boosting (GB), are evaluated with respect to detection accuracy and computational efficiency. In addition to classification performance, this study presents a detailed analysis of training and detection times in order to identify the most suitable model for practical deployment. Experimental results indicate that the DT model achieves a training time of 4.14 seconds with macro and weighted accuracies of 94.11% and 91.71%, respectively, and a per-category detection time of 0.2162 seconds. The RF model attains macro and weighted accuracies of 93.64% and 90.94%, with training and detection times of 9.73 seconds and 0.2420 seconds, respectively. Although the GB model exhibits the longest training time of 45.38 seconds, it achieves the fastest per-category detection time of 0.2151 seconds. Despite its comparatively lower overall accuracy of 92.48% for macro averaging and 89.42% for weighted averaging, the rapid inference capability of GB makes it a strong candidate for real-time malicious URL detection in heterogeneous cybersecurity environments.
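
The abstract reports macro and weighted accuracies without defining them. The sketch below assumes one common convention, which may differ from the paper's: per-class accuracy is the fraction of each true class that is predicted correctly, the macro figure is the plain mean over classes, and the weighted figure weights each class by its sample count. The function names and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def per_class_accuracy(y_true, y_pred):
    """Return ({class: fraction correctly predicted}, {class: sample count})."""
    support = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    # Counter returns 0 for classes that were never predicted correctly.
    return {c: correct[c] / support[c] for c in support}, support


def macro_and_weighted(y_true, y_pred):
    """Assumed definitions: macro = unweighted mean, weighted = support-weighted mean."""
    acc, support = per_class_accuracy(y_true, y_pred)
    macro = sum(acc.values()) / len(acc)
    total = sum(support.values())
    weighted = sum(acc[c] * support[c] for c in acc) / total
    return macro, weighted


# Toy labels invented for illustration only (not data from the paper).
y_true = ["benign"] * 6 + ["phishing"] * 2 + ["malware"] + ["defacement"]
y_pred = ["benign"] * 5 + ["phishing"] * 3 + ["malware"] + ["benign"]

macro, weighted = macro_and_weighted(y_true, y_pred)
print(f"macro={macro:.4f} weighted={weighted:.4f}")
```

Under these assumed definitions, the two figures diverge whenever class sizes are unequal. The macro value exceeds the weighted value when the larger classes are classified less accurately than the smaller ones, and falls below it in the opposite case.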

Co-Authors Parman Sukarno
