Scientific Journal of Informatics
Vol 10, No 4 (2023): November 2023

Voting Classifier Technique and Count Vectorizer with N-gram to Identifying Hate Speech and Abusive Tweets in Indonesian

Riza Arifudin (Department of Computer Science, Universitas Negeri Semarang, Indonesia)
Dandi Indra Wijaya (Department of Computer Science, Universitas Negeri Semarang, Indonesia)
Budi Warsito (Department of Statistics, Diponegoro University, Indonesia)
Adi Wibowo (Department of Informatics, Diponegoro University, Indonesia)

Article Info

Publish Date
20 Nov 2023

Abstract

Purpose: This study aims to identify hate speech and abusive tweets in Indonesian using a Voting Classifier with Count Vectorizer and N-gram features. A Voting Classifier combines multiple classifiers, here Random Forest and Support Vector Machine, to improve classification accuracy.

Methods: The data are first preprocessed. Features are then extracted with Count Vectorizer and N-grams. A Support Vector Machine and a Random Forest serve as the estimators of the Voting Classifier. The study evaluates how effective this combination is.

Result: The best accuracy, 82.50%, was obtained by combining the Voting Classifier with Count Vectorizer feature extraction using 1-gram (unigram) features. The study concludes that the approach is practical for identifying hate speech and abusive tweets.

Novelty: Combining multiple classifiers with feature extraction techniques such as Count Vectorizer and N-grams gives a machine learning approach to sentiment analysis that can differentiate between hate speech and abusive tweets.

Copyrights © 2023

Journal Info

Abbrev

SJI

Subject

Computer Science & IT

Description

Scientific Journal of Informatics published by the Department of Computer Science, Semarang State University, a scientific journal of Information Systems and Information Technology which includes scholarly writings on pure research and applied research in the field of information systems and ...