Comparison of Support Vector Machine and Naïve Bayes Algorithms Based on TF-IDF in Online Gambling Website Detection
Authors : Refianti, Rina; Alhafiz, Husein
International Journal of Engineering, Science and Information Technology Vol 6, No 1 (2026)
Publisher : Malikussaleh University, Aceh, Indonesia

DOI: 10.52088/ijesty.v6i1.1794

Abstract

The rapid growth of digital technology has significantly accelerated the spread of illegal online content, particularly gambling websites, which threaten social stability and regulatory enforcement. To address this issue, this study develops an automated detection system for online gambling sites using text classification with the Term Frequency–Inverse Document Frequency (TF-IDF) approach. A total of 1,225 website URLs were collected through web scraping, and after preprocessing, 1,166 valid entries were manually labeled into two classes: gambling and normal. The preprocessing steps included cleaning, tokenizing, stopword removal, stemming, and domain parsing, followed by feature extraction using TF-IDF, which generated 2,426 numerical features. To mitigate class imbalance, the Synthetic Minority Oversampling Technique (SMOTE) was applied to the training dataset. Two machine learning algorithms were implemented and compared: Support Vector Machine (SVM) with multiple kernels (Linear, RBF, Polynomial, and Sigmoid) and Multinomial Naïve Bayes (MNB). Experimental evaluation was conducted using accuracy, precision, recall, specificity, and F1-score metrics. Results demonstrate that SVM with the RBF kernel achieved the best performance, with an accuracy of 91.88% and an F1-score of 93.70%, while MNB obtained an accuracy of 88.46% and an F1-score of 91.00%. These findings confirm that SVM, particularly with the RBF kernel, delivers more stable and accurate performance in distinguishing gambling websites from normal ones. The proposed system offers a reliable foundation for the development of automated tools to monitor, detect, and block illegal online gambling content, thereby supporting regulatory enforcement and reducing the negative societal impacts of online gambling.
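
The five evaluation metrics named in the abstract can all be derived from the four counts of a binary confusion matrix. The following is a minimal sketch, not the authors' code: it assumes "gambling" is treated as the positive class, and the names `ConfusionCounts`, `count_outcomes`, and `binary_metrics`, as well as the toy labels, are hypothetical and chosen only for illustration. The numbers it produces are unrelated to the results reported above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts (positive class assumed to be 'gambling')."""
    tp: int  # gambling sites predicted as gambling
    fp: int  # normal sites predicted as gambling
    tn: int  # normal sites predicted as normal
    fn: int  # gambling sites predicted as normal


def count_outcomes(y_true, y_pred, positive="gambling"):
    """Tally TP/FP/TN/FN from parallel lists of true and predicted labels."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return ConfusionCounts(tp=tp, fp=fp, tn=tn, fn=fn)


def _safe_div(numerator, denominator):
    """Return 0.0 instead of raising when a denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(c):
    """Compute accuracy, precision, recall, specificity, and F1-score."""
    precision = _safe_div(c.tp, c.tp + c.fp)
    recall = _safe_div(c.tp, c.tp + c.fn)
    return {
        "accuracy": _safe_div(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        "precision": precision,
        "recall": recall,
        "specificity": _safe_div(c.tn, c.tn + c.fp),
        "f1": _safe_div(2 * precision * recall, precision + recall),
    }


# Toy labels for illustration only (not data from the study).
y_true = ["gambling"] * 4 + ["normal"] * 4
y_pred = ["gambling", "gambling", "gambling", "normal",
          "normal", "normal", "gambling", "gambling"]

counts = count_outcomes(y_true, y_pred)
metrics = binary_metrics(counts)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Specificity is reported alongside recall because it measures how often normal sites are correctly left alone, which matters when a detector is used to block content.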