Journal of Information Technology and Computer Science
Vol. 10 No. 1: April 2025

Transfer Learning Methods for Hate Speech Detection in Bahasa Indonesia

Fairuz Astari Devianty (Unknown)



Article Info

Publish Date
02 May 2025

Abstract

Communication is becoming more accessible with the growth and emergence of social media platforms. However, this accessibility can be misused, for example to spread hate speech. Building an efficient hate speech detection model requires sufficient annotated training data, which is scarce for low-resource languages like Bahasa Indonesia. To address this issue, we study whether transfer learning can yield improved results. We perform extensive experiments showing that transfer learning methods are suitable for low-resource hate speech prediction. Our experimental results show that transferring knowledge from a multilingual pre-trained language model and adding translated hate speech datasets as training data can both improve hate speech detection in Bahasa Indonesia. Using the XLM-RoBERTa-based hate speech model for transfer learning improved the F1-score for hate speech detection in Bahasa Indonesia by 78%. Meanwhile, translating English data as additional training data and using the BERT model to detect hate speech in Bahasa Indonesia improved the F1-score from 60% to 69%. These results were statistically supported by the McNemar test and evaluated using the ROC-AUC score.
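The abstract names the F1-score and the McNemar test as its evaluation tools. As a rough illustration only, the sketch below shows how two classifiers evaluated on the same test set might be compared with an exact (binomial) McNemar test. The function names and the toy labels are hypothetical and are not taken from the paper; the abstract does not state which variant of the McNemar test was used, so the exact two-sided form is an assumption.

```python
from math import comb


def f1_score(y_true, y_pred, positive=1):
    """F1 for the positive class (here assumed to mean 'hate speech')."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcnemar_exact(y_true, pred_a, pred_b):
    """Exact two-sided McNemar test on paired predictions.

    b = items model A gets right and model B gets wrong
    c = items model A gets wrong and model B gets right
    Under the null hypothesis the discordant pairs split 50/50.
    """
    triples = list(zip(y_true, pred_a, pred_b))
    b = sum(1 for t, a, m in triples if a == t and m != t)
    c = sum(1 for t, a, m in triples if a != t and m == t)
    n = b + c
    if n == 0:
        return b, c, 1.0  # no discordant pairs: no evidence of a difference
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


# Toy data, invented for illustration (not results from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
baseline = [1, 1, 0, 0, 0, 0, 1, 1]
transfer = [1, 1, 1, 1, 0, 0, 0, 0]

print(f1_score(y_true, baseline), f1_score(y_true, transfer))
print(mcnemar_exact(y_true, baseline, transfer))
```

The test looks only at the examples on which the two models disagree, which is why it suits comparisons where both models are scored on one shared test set.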

Copyrights © 2025






Journal Info

Abbrev

jitecs

Publisher

Subject

Computer Science & IT; Control & Systems Engineering; Electrical & Electronics Engineering; Engineering

Description

The Journal of Information Technology and Computer Science (JITeCS) is a peer-reviewed open access journal published by the Faculty of Computer Science, Universitas Brawijaya (UB), Indonesia. The journal is an archival journal serving scientists and engineers involved in all aspects of information ...