Jurnal Sisfokom (Sistem Informasi dan Komputer)
Vol. 15 No. 3 (2026): JULY

Comparative Analysis of the Performance of Machine Learning and Deep Learning Methods in Detecting Hate Speech in Indonesia

Priscilla Desinta Achelya (Master of Informatics, Indonesian Institute of Business and Technology)
Ni Wayan Sumartini Saraswati (Master of Informatics, Indonesian Institute of Business and Technology)
I Putu Agus Eka Darma Udayana (Master of Informatics, Indonesian Institute of Business and Technology)



Article Info

Publish Date
05 Jun 2026

Abstract

The rapid expansion of social media usage in Indonesia has increased the spread of harmful online communication, including hate speech, which may contribute to social conflict and discrimination. As a result, automated hate speech identification has become an important research area in Indonesian natural language processing. Although many studies have applied machine learning and deep learning techniques for this task, comprehensive comparisons between conventional algorithms and transformer-based models in the Indonesian context remain limited. This study evaluates several machine learning algorithms, namely Naïve Bayes (NB), Support Vector Machine (SVM), Logistic Regression (LR), and Random Forest (RF), alongside the transformer-based IndoBERT model for Indonesian hate speech classification. All models were trained and evaluated using the same dataset, identical preprocessing stages, and consistent evaluation metrics consisting of accuracy, precision, recall, and F1-score to ensure fair comparison. Experimental findings show that IndoBERT achieved the strongest overall performance, reaching an accuracy of 87.45% and an F1-score of 84.92%. Among the classical machine learning approaches, Logistic Regression produced the highest result with an accuracy of 84.49% and an F1-score of 84.32%. While several machine learning models obtained relatively competitive recall values, IndoBERT demonstrated more stable performance across evaluation metrics and showed stronger capability in understanding contextual language patterns commonly found in Indonesian social media content. Overall, the study highlights the advantages and trade-offs between conventional machine learning and transformer-based deep learning approaches in Indonesian hate speech detection, while also providing practical insights for developing automated content moderation systems.
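The shared evaluation protocol described in the abstract, in which every model is scored with the same accuracy, precision, recall, and F1-score, can be illustrated with a minimal sketch. This is not the authors' code. The labels and predictions are hypothetical toy data, `model_a` and `model_b` are placeholder names rather than any of the evaluated classifiers, and the abstract does not state which averaging scheme was used, so the sketch assumes plain binary metrics with hate speech as the positive class.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Scores:
    accuracy: float
    precision: float
    recall: float
    f1: float


def evaluate(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> Scores:
    """Compute binary metrics, treating `positive` as the hate-speech class (assumption)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("cannot evaluate an empty label set")

    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    correct = sum(t == p for t, p in pairs)

    # Guard against division by zero when a model never predicts the positive class.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return Scores(correct / len(pairs), precision, recall, f1)


# Hypothetical gold labels and model outputs (1 = hate speech, 0 = not hate speech).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
predictions = {
    "model_a": [1, 1, 1, 0, 0, 0, 0, 1],
    "model_b": [1, 1, 1, 1, 1, 1, 0, 0],
}

# Every model is scored by the same function on the same labels.
results = {name: evaluate(y_true, y_pred) for name, y_pred in predictions.items()}
for name, s in results.items():
    print(f"{name}: acc={s.accuracy:.2f} prec={s.precision:.2f} "
          f"rec={s.recall:.2f} f1={s.f1:.2f}")
```

In this toy data both placeholder models reach the same accuracy while differing in precision and recall, which is why a comparison of this kind reports all four metrics rather than accuracy alone.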

Copyrights © 2026






Journal Info

Abbrev

sisfokom

Publisher

Subject

Computer Science & IT; Control & Systems Engineering; Decision Sciences, Operations Research & Management

Description

Jurnal Sisfokom is short for Jurnal Sistem Informasi dan Komputer (Journal of Information Systems and Computers). The journal is a collaboration between the academic community of STMIK Atma Luhur and other colleges and universities in Indonesia. It publishes scientific articles by researchers, academics, and IT observers. The journal ...