Jurnal Algoritma
Vol 22 No 2 (2025): Jurnal Algoritma

Detection of Indonesian-Language Online Gambling Spam Comments Using XGBoost and TF-IDF (Deteksi Komentar Spam Judi Online Berbahasa Indonesia Menggunakan XGBoost dan TF-IDF)

Arrayyan, Dzakwan Rafi
Guntara, Rangga Gelar
Nugraha, Muhammad Rizki



Article Info

Publish Date
30 Nov 2025

Abstract

Online gambling continues to grow at a worrying pace. One resulting challenge is the proliferation of gambling promotion comments on YouTube, which spread because existing spam detection systems do not reliably recognize manipulative language patterns. To address this issue, this study proposes a model for detecting Indonesian-language spam comments that combines Term Frequency–Inverse Document Frequency (TF-IDF) features with an Extreme Gradient Boosting (XGBoost) classifier. The dataset contains 10,220 manually labeled YouTube comments, which were preprocessed with steps that include Unicode normalization and removal of irrelevant characters. The model was evaluated on a test set comprising 20% of the data and achieved an accuracy of 91%, precision of 92%, recall of 91%, and an F1-score of 91%. These results show that the combination of TF-IDF and XGBoost is effective for classifying the short texts found in YouTube comments. The study thus contributes to the development of Indonesian-language spam comment detection models, which remain rarely researched, and can serve as a reference for media platforms seeking to stop the spread of illegal content through social media comment sections more effectively.
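The preprocessing and feature-extraction steps named in the abstract can be sketched roughly as follows. This is a minimal standard-library illustration, not the authors' implementation: the function names `normalize_comment` and `tfidf_vectors`, the cleaning rule, the smoothed IDF formula, and the three sample comments are all assumptions made for the example. The XGBoost classifier that would be trained on the resulting vectors is not shown.

```python
import math
import re
import unicodedata
from collections import Counter


def normalize_comment(text: str) -> str:
    """Fold stylized Unicode to plain characters, lowercase, and drop symbols."""
    # NFKC maps compatibility characters (e.g. mathematical bold letters,
    # often used to disguise spam keywords) to their plain equivalents.
    text = unicodedata.normalize("NFKC", text).lower()
    # Assumed cleaning rule: keep only ASCII letters, digits, and whitespace.
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def tfidf_vectors(docs: list[str]) -> list[dict[str, float]]:
    """Return one sparse {term: weight} mapping per whitespace-tokenized document."""
    tokenized = [doc.split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        # Assumed weighting: relative term frequency times a smoothed IDF,
        # idf = log((1 + N) / (1 + df)) + 1.
        vectors.append({
            term: (count / total)
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


# "SLOT88" written in mathematical bold characters (hypothetical spam keyword).
stylized = "\U0001D412\U0001D40B\U0001D40E\U0001D413\U0001D7D6\U0001D7D6"

# Hypothetical sample comments, invented for illustration only.
comments = [
    stylized + " gacor, daftar sekarang!!!",
    "Videonya bagus, terima kasih",
    "Daftar " + stylized + " sekarang juga",
]
cleaned = [normalize_comment(c) for c in comments]
features = tfidf_vectors(cleaned)
```

In this sketch, a term that appears in fewer comments receives a higher weight than one that appears in many, which is the property that lets a downstream classifier focus on distinctive vocabulary.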

Copyright © 2025

Journal Info

Abbrev

algoritma

Subject

Computer Science & IT

Description

Jurnal Algoritma is a journal that publishes research results in the fields of Information Technology (TI), Information Systems (SI), Software Engineering (RPL), Multimedia (MM), and Computer Science (Computer ...