Claim Missing Document
Check
Articles

Found 1 Documents
Search
Journal : JOURNAL OF APPLIED INFORMATICS AND COMPUTING

Comparative Analysis of CNN, Transformers, and Traditional ML for Classifying Online Gambling Spam Comments in Indonesian Manullang, Martin Clinton Tosima; Rakhman, Arkham Zahri; Tantriawan, Hartanto; Setiawan, Andika
Journal of Applied Informatics and Computing Vol. 9 No. 3 (2025): June 2025
Publisher : Politeknik Negeri Batam

Show Abstract | Download Original | Original Source | Check in Google Scholar | DOI: 10.30871/jaic.v9i3.9468

Abstract

The rise of user-generated content on social media and live-streaming platforms has intensified the spread of spam, particularly online gambling (Judi Online) promotions, which remain prevalent in Indonesian comment sections. This study investigates the effectiveness of various machine learning (ML) and deep learning (DL) approaches in classifying such spam content in Bahasa Indonesia. We compare five models: Support Vector Machine (SVM), Random Forest (RF), a CNN-based model, IndoBERT, and a custom lightweight transformer model named Wordformer. While IndoBERT achieves the highest performance across all metrics, it comes with high computational demands. Wordformer, in contrast, delivers a strong balance between accuracy and efficiency, outperforming traditional models while being significantly more lightweight than IndoBERT. Wordformer achieved 0.9975 accuracy and macro F1-score, surpassing SVM (0.9578) and Random Forest (0.9729), while maintaining a significantly smaller model size and fewer multiply-add operations. An extensive ablation study further explores the architectural and training design choices that influence Wordformer’s performance. The findings suggest that lightweight transformer models can offer practical, scalable solutions for spam detection in low-resource language settings without the need for large pretrained backbones.