Journal of International Multidisciplinary Research

Vol. 1 No. 1 (2023): November 2023

Handayani, Tri Pratiwi (Unknown)
Hasyim, Wahyudin (Unknown)
Wati, Nursetia (Unknown)

Publish Date
29 Nov 2023

Automatic detection of hate speech and abusive language is crucial for combating online toxicity. This study evaluates Gaussian Naive Bayes for multi-label classification of hate speech on Indonesian Twitter, where the labels cover the target, category, and level of hate speech. TF-IDF features were combined with contextual BERT embeddings. The model achieved balanced performance on general hate speech and detected non-abusive language well. However, it handled imbalanced data and specific hate speech types poorly: across labels, the classifier consistently favored the majority class (non-hateful/non-abusive), and it struggled most with minority labels such as HS_Gender and HS_Physical. This suggests difficulty detecting less frequent but potentially severe hate speech, likely due to limited training data. Overall accuracy and F1-scores indicate that, while Gaussian Naive Bayes is efficient, it lacks robustness for nuanced multi-label classification on imbalanced datasets. Alternative approaches are therefore needed to detect specific and less frequent hate speech effectively.
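
The abstract does not say how the multi-label problem was decomposed, so the following is only a minimal sketch under stated assumptions: it uses binary relevance (one independent Gaussian Naive Bayes model per label), toy two-dimensional vectors in place of the concatenated TF-IDF and BERT features, and illustrative label names (`HS`, `Abusive`). The helper names and the `var_smoothing` value are hypothetical and are not taken from the paper.

```python
import math


def fit_gaussian_nb(X, y, var_smoothing=1e-3):
    """Fit one Gaussian Naive Bayes model: log prior, means, variances per class."""
    model = {}
    for cls in set(y):
        rows = [x for x, label in zip(X, y) if label == cls]
        n_features = len(rows[0])
        means = [sum(r[j] for r in rows) / len(rows) for j in range(n_features)]
        # var_smoothing (an assumed value) keeps variances strictly positive.
        variances = [
            sum((r[j] - means[j]) ** 2 for r in rows) / len(rows) + var_smoothing
            for j in range(n_features)
        ]
        model[cls] = (math.log(len(rows) / len(X)), means, variances)
    return model


def predict_gaussian_nb(model, x):
    """Return the class with the highest log posterior (up to a constant)."""
    def log_posterior(cls):
        log_prior, means, variances = model[cls]
        log_likelihood = sum(
            -0.5 * math.log(2 * math.pi * v) - (xj - m) ** 2 / (2 * v)
            for xj, m, v in zip(x, means, variances)
        )
        return log_prior + log_likelihood
    return max(model, key=log_posterior)


def fit_multilabel(X, Y):
    """Binary relevance: train one independent model per label column."""
    return {name: fit_gaussian_nb(X, column) for name, column in Y.items()}


def predict_multilabel(models, x):
    """Predict every label for a single feature vector."""
    return {name: predict_gaussian_nb(m, x) for name, m in models.items()}


# Toy vectors standing in for concatenated TF-IDF + BERT features (hypothetical data).
X = [[0.9, 0.8], [0.8, 0.9], [0.1, 0.2], [0.2, 0.1], [0.15, 0.15], [0.85, 0.2]]
Y = {
    "HS": [1, 1, 0, 0, 0, 1],       # illustrative "hate speech" label
    "Abusive": [1, 1, 0, 0, 0, 0],  # illustrative "abusive language" label
}

models = fit_multilabel(X, Y)
prediction = predict_multilabel(models, [0.88, 0.85])
print(prediction)
```

Because each class's log prior enters the posterior directly, a label with very few positive examples starts at a disadvantage, which is one plausible reading of the majority-class bias the abstract reports.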

Journal Info

Abbrev

jimr

Publisher

PT. Banjarese Pacific Indonesia

Subject

Other

Description

Journal of International Multidisciplinary Research is a scientific publication that aims to provide a broad platform for research, discussion, and deeper understanding across disciplines. The journal welcomes contributions from all fields of science, including ...

Article Info

Abstract

Preliminary Evaluation of Gaussian Naive Bayes for Multi-Label Hate Speech and Abusive Language Detection on Indonesian Twitter
