Hate Speech Analysis Using IndoBERT in YouTube Comments on the 2024 Indonesian Presidential Debate Video
Agus Sasmito Aribowo; Yuli Fauziah; Yusna Bantulu; Shoffan Saifullah; Azfa Mutiara Ahmad Fubalo
Kinetik: Game Technology, Information System, Computer Network, Computing, Electronics, and Control Vol. 11, No. 3, August 2026 (Article in Progress)
Publisher: Universitas Muhammadiyah Malang

DOI: 10.22219/kinetik.v11i3.2604

Abstract

Hate speech in the digital political space during election campaigns can drive polarization and undermine the quality of public discussion. This study analyzes hate speech in YouTube comments on the five stages of the 2024 Indonesian presidential debate. We used IndoBERT, a Transformer-based language model trained specifically on Indonesian text, to classify comments as hate speech or non-hate speech. The dataset consists of 38,742 comments collected from the official debate videos. It was labeled through a combination of manual annotation (20%) and semi-supervised learning with a pseudo-labeling approach (80%). Experimental results show that IndoBERT achieved an average accuracy of 89.7% and a macro F1-score of 0.89 across all stages, outperforming the baseline models mBERT, SVM, and Random Forest. These findings suggest that IndoBERT captures the linguistic nuances and distinctive rhetoric of Indonesian politics more effectively than multilingual or classical models. This study contributes an Indonesian-language political dataset and a comprehensive evaluation of relevant hate speech detection models for further research.

Keywords: hate speech, IndoBERT, 2024 presidential debate, semi-supervised learning.
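
The abstract reports both accuracy and a macro F1-score. As a minimal sketch of how the two metrics differ for a binary hate / non-hate classifier, the Python below computes each from scratch. The function names, the label strings, and the six-comment toy data are illustrative assumptions and are not taken from the study or its dataset.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        f1_scores.append(f1)
    return sum(f1_scores) / len(f1_scores)


# Hypothetical toy labels, invented for illustration only.
y_true = ["hate", "hate", "non", "non", "non", "non"]
y_pred = ["hate", "non", "non", "non", "non", "hate"]

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.3f}")
```

Because macro F1 gives each class equal weight regardless of how many comments it contains, it can diverge from accuracy when one class is much smaller than the other, which is one reason the two figures are commonly reported together.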