Journal of International Multidisciplinary Research
Vol. 1 No. 1 (2023): November 2023

Preliminary Evaluation of Gaussian Naive Bayes for Multi-Label Hate Speech and Abusive Language Detection on Indonesian Twitter

Handayani, Tri Pratiwi
Hasyim, Wahyudin
Wati, Nursetia

Article Info

Publish Date
29 Nov 2023

Abstract

Automatic detection of hate speech and abusive language is crucial for combating online toxicity. This study evaluates Gaussian Naive Bayes for multi-label classification of hate speech on Indonesian Twitter, covering the target, category, and level of each instance. Features combine TF-IDF vectors with contextual BERT embeddings. The model achieved balanced performance on the general hate speech label and detected non-abusive language well. However, it handled imbalanced data and specific hate speech types poorly: across labels, the classifier consistently favored the majority class (non-hateful or non-abusive) and struggled in particular with labels such as HS_Gender and HS_Physical. This suggests difficulty detecting less frequent but potentially severe hate speech, likely because of limited training data for those labels. Overall accuracy and F1-scores indicate that, although Gaussian Naive Bayes is efficient, it lacks robustness for nuanced multi-label classification on imbalanced datasets. Alternative approaches should therefore be explored for detecting specific and less frequent types of hate speech.
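
The pipeline outlined in the abstract (TF-IDF features joined with embedding features, then one Gaussian Naive Bayes decision per label) could look roughly like the sketch below. It is an illustration under stated assumptions, not the authors' implementation: the helper names (`fit_tfidf`, `tfidf_transform`, `combine_features`, `TinyGaussianNB`, `fit_multilabel`, `predict_multilabel`), the variance floor `eps`, the binary-relevance setup, the placeholder tokens, the label columns, and the hand-written two-dimensional "embedding" vectors standing in for BERT output are all hypothetical.

```python
import math


def fit_tfidf(docs):
    """Learn a vocabulary and smoothed IDF weights from whitespace-tokenized docs."""
    vocab = sorted({tok for doc in docs for tok in doc.split()})
    n_docs = len(docs)
    idf = {}
    for tok in vocab:
        doc_freq = sum(tok in doc.split() for doc in docs)
        idf[tok] = math.log((1 + n_docs) / (1 + doc_freq)) + 1.0
    return vocab, idf


def tfidf_transform(docs, vocab, idf):
    """Map each doc to a dense TF-IDF vector; tokens outside the vocabulary are ignored."""
    rows = []
    for doc in docs:
        toks = doc.split()
        rows.append([toks.count(tok) / max(len(toks), 1) * idf[tok] for tok in vocab])
    return rows


def combine_features(tfidf_rows, embedding_rows):
    """Concatenate TF-IDF features with (externally computed) embedding features."""
    return [list(t) + list(e) for t, e in zip(tfidf_rows, embedding_rows)]


class TinyGaussianNB:
    """Minimal Gaussian Naive Bayes for a single label (illustrative only)."""

    def __init__(self, eps=1e-6):
        self.eps = eps  # assumed variance floor; avoids division by zero

    def fit(self, X, y):
        self.stats = {}
        for c in sorted(set(y)):
            rows = [x for x, label in zip(X, y) if label == c]
            cols = list(zip(*rows))
            means = [sum(col) / len(rows) for col in cols]
            variances = [
                sum((v - m) ** 2 for v in col) / len(rows) + self.eps
                for col, m in zip(cols, means)
            ]
            self.stats[c] = (math.log(len(rows) / len(X)), means, variances)
        return self

    def _log_joint(self, x, c):
        # log prior plus the sum of per-feature Gaussian log-densities
        log_prior, means, variances = self.stats[c]
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(x, means, variances)
        )

    def predict(self, X):
        return [max(self.stats, key=lambda c: self._log_joint(x, c)) for x in X]


def fit_multilabel(X, Y):
    """Binary relevance: train one independent classifier per label column."""
    return [TinyGaussianNB().fit(X, list(col)) for col in zip(*Y)]


def predict_multilabel(models, X):
    """Return one row of 0/1 predictions per sample, one column per label."""
    per_label = [model.predict(X) for model in models]
    return [list(row) for row in zip(*per_label)]


if __name__ == "__main__":
    # Placeholder tokens and labels; columns are a hypothetical (HS, Abusive) pair.
    docs = ["good morning friends", "thanks for the help", "nice weather today",
            "slur_x go away", "curse_y bad day", "slur_x curse_y leave now"]
    labels = [[0, 0], [0, 0], [0, 0], [1, 0], [0, 1], [1, 1]]
    # Hand-written stand-ins for sentence embeddings, NOT real BERT output.
    emb = [[0.9, 0.1], [0.8, 0.2], [0.85, 0.15], [0.2, 0.9], [0.3, 0.7], [0.1, 0.95]]

    vocab, idf = fit_tfidf(docs)
    X_train = combine_features(tfidf_transform(docs, vocab, idf), emb)
    models = fit_multilabel(X_train, labels)

    X_new = combine_features(tfidf_transform(["slur_x leave"], vocab, idf), [[0.15, 0.9]])
    print(predict_multilabel(models, X_new))
```

Because each label is fitted independently, a label with very few positive examples gets a small class prior and poorly estimated per-feature variances, which is one plausible reason a classifier of this kind leans toward the majority class on rare labels.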

Copyrights © 2023

Journal Info

Abbrev

jimr

Subject

Other

Description

Journal of International Multidisciplinary Research is a scientific publication that aims to provide a broad platform for research, discussion, and deeper understanding across various disciplines. The journal welcomes contributions from all fields of science, including ...