Comparative Performance of Machine Learning Algorithms for Detecting Online Gambling Promotional Comments on YouTube
Michael Angelo; Robet; Hendrik, Jackri
Jurnal Teknologi dan Manajemen Informatika Vol. 11 No. 2 (2025): Desember 2025
Publisher: Universitas Merdeka Malang

DOI: 10.26905/jtmi.v11i2.16286

Abstract

Online-gambling promoters increasingly exploit YouTube comment sections, using text obfuscation, Unicode characters, emojis, irregular spacing, and symbols to evade automated moderation. This study aims to identify the most effective machine-learning algorithm for detecting such promotional comments by comparing models on standard metrics (precision, recall, F1-score, accuracy). We employ semi-supervised pseudo-labelling to expand the labelled set from 1,648 to 9,111 comments without additional manual annotation, admitting only high-confidence predictions. The pipeline includes customised character normalization, selective cleaning, tokenization, stopword removal, and Nazief–Adriani stemming, followed by TF–IDF feature extraction. Four algorithms are evaluated: Multinomial Naive Bayes, Logistic Regression, Random Forest, and Support Vector Machine, with hyperparameter optimization and class balancing via SMOTE. On a 1,823-sample test set, all models achieve over 98% accuracy; SVM yields the most balanced performance, resulting in the highest F1-score for the promotion class (0.9908). Confusion matrices and learning curves indicate stable behavior without overfitting or underfitting. We therefore recommend SVM for operational deployment in automated moderation of gambling-promotion comments on YouTube. These findings provide practical guidance for platform safety teams and suggest methodological baselines for similar NLP moderation tasks. Future work should explore ensemble and deep learning approaches, incorporate character and subword-level features, and further evaluate robustness under adversarial obfuscation and domain shift.
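The abstract names two steps that are easier to picture with a small example: character normalization of obfuscated comments, and admitting only high-confidence predictions as pseudo-labels. The following is a minimal sketch under stated assumptions, not the authors' implementation. The function names, the use of Unicode NFKC folding, the separator-joining rule, and the 0.95 confidence threshold are all illustrative choices that do not appear in the source.

```python
import re
import unicodedata


def normalize_comment(text: str) -> str:
    """Hypothetical normalizer for obfuscated comment text (assumed rules)."""
    # NFKC folds compatibility characters (fullwidth letters, mathematical
    # bold letters, and similar) into their plain equivalents.
    text = unicodedata.normalize("NFKC", text).lower()
    # Rejoin runs of single characters split by one space, dot, underscore,
    # or hyphen, e.g. "g a c o r" -> "gacor". This rule is an assumption.
    text = re.sub(
        r"\b(?:\w[ ._\-]){2,}\w\b",
        lambda m: re.sub(r"[ ._\-]", "", m.group(0)),
        text,
    )
    # Collapse irregular whitespace into single spaces.
    return re.sub(r"\s+", " ", text).strip()


def select_pseudo_labels(scored, threshold=0.95):
    """Keep only confident predictions as pseudo-labels.

    `scored` is an iterable of (comment, probability_of_promotion) pairs
    produced by some already-trained classifier. The threshold value is an
    assumption; the source does not state the one used in the study.
    """
    selected = []
    for comment, p_promo in scored:
        if p_promo >= threshold:
            selected.append((comment, 1))  # confident promotion
        elif p_promo <= 1 - threshold:
            selected.append((comment, 0))  # confident non-promotion
        # Everything in between stays unlabelled.
    return selected
```

In a setup like this, the normalized text would then pass through tokenization, stopword removal, stemming, and TF–IDF extraction as the abstract describes, and the pseudo-labelled comments would be merged with the manually labelled ones before the final models are trained.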