MATRIK : Jurnal Manajemen, Teknik Informatika, dan Rekayasa Komputer
Vol 22 No 3 (2023)

Hate Speech Detection for Banjarese Languages on Instagram Using Machine Learning Methods

Muhammad Alkaff (Universitas Lambung Mangkurat, Banjarmasin, Indonesia)
Muhammad Afrizal Miqdad (Universitas Lambung Mangkurat, Banjarmasin, Indonesia)
Muhammad Fachrurrazi (Universitas Lambung Mangkurat, Banjarmasin, Indonesia)
Muhammad Nur Abdi (Universitas Lambung Mangkurat, Banjarmasin, Indonesia)
Ahmad Zainul Abidin (Universitas Lambung Mangkurat)
Raisa Amalia (Universitas Lambung Mangkurat, Banjarmasin, Indonesia)



Article Info

Publish Date
07 Jul 2023

Abstract

Hate speech refers to verbal expression or communication that aims to provoke or discriminate against individuals. The Ministry of Communication and Information of Indonesia encountered and dealt with 3,640 cases of hate speech transmitted through digital channels between 2018 and 2021. In South Kalimantan in particular, hate speech in the local language, Banjarese, has become increasingly prevalent in recent years. Despite this, there is a lack of research on using machine learning to detect hate speech in the Banjarese language, specifically on Instagram. This study therefore aimed to address this gap by constructing a dataset of Banjarese-language hate speech and comparing several feature extraction techniques and machine learning models for detecting it. The feature extraction methods used were Word N-Gram, Term Frequency-Inverse Document Frequency (TF-IDF), a combination of Word N-Gram and TF-IDF, Word2Vec, and GloVe, while the machine learning methods used were Support Vector Machine (SVM), Naïve Bayes, and Decision Tree. The results showed that the combination of TF-IDF for feature extraction and SVM as the model achieved exceptional performance. The average Recall, Precision, Accuracy, and F1-Score exceeded 90%, demonstrating the model's ability to identify Banjarese hate speech accurately.
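
To make the feature extraction step concrete, the following is a minimal sketch of how word n-grams and TF-IDF weights can be combined into one feature vector per comment. It is not the authors' implementation: the function names `word_ngrams` and `tfidf_vectors`, the whitespace tokenisation, the smoothed IDF formula, the L2 normalisation, and the placeholder comments are all assumptions made for illustration, since the abstract does not specify these details.

```python
import math
from collections import Counter


def word_ngrams(text, n_max=2):
    """Return all word n-grams of length 1..n_max from a lowercased text."""
    tokens = text.lower().split()  # assumption: simple whitespace tokenisation
    grams = []
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def tfidf_vectors(docs, n_max=2):
    """Compute L2-normalised TF-IDF vectors over word n-grams.

    Uses the smoothed IDF idf(t) = ln((1 + N) / (1 + df(t))) + 1,
    one common variant; the abstract does not say which was used.
    """
    grams_per_doc = [word_ngrams(d, n_max) for d in docs]
    vocab = sorted({g for grams in grams_per_doc for g in grams})
    n_docs = len(docs)
    # Document frequency: number of documents containing each n-gram.
    df = Counter(g for grams in grams_per_doc for g in set(grams))
    idf = {g: math.log((1 + n_docs) / (1 + df[g])) + 1.0 for g in vocab}

    vectors = []
    for grams in grams_per_doc:
        tf = Counter(grams)  # raw term counts within this document
        vec = [tf[g] * idf[g] for g in vocab]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors.append([v / norm for v in vec])
    return vocab, vectors


# Hypothetical placeholder comments, not taken from the study's dataset.
comments = ["comment one here", "comment two here", "another comment"]
vocab, X = tfidf_vectors(comments, n_max=2)
```

Each row of `X` could then be passed, together with a hate-speech label, to a classifier such as an SVM. N-grams that occur in fewer comments receive a larger weight than n-grams shared by every comment.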

Copyrights © 2023






Journal Info

Abbrev

matrik

Publisher

Subject

Computer Science & IT

Description

MATRIK is one of the scientific journals of Universitas Bumigora Mataram (formerly STMIK Bumigora Mataram), managed under the Institute for Research and Community Service (Lembaga Penelitian dan Pengabdian kepada Masyarakat, LPPM). The journal aims to provide a publication venue for lecturers, researchers, and ...