Dandi Indra Wijaya
Department of Computer Science, Universitas Negeri Semarang, Indonesia

Published : 1 Documents Claim Missing Document
Claim Missing Document
Check
Articles

Found 1 Documents
Search

Voting Classifier Technique and Count Vectorizer with N-gram to Identifying Hate Speech and Abusive Tweets in Indonesian Riza Arifudin; Dandi Indra Wijaya; Budi Warsito; Adi Wibowo
Scientific Journal of Informatics Vol 10, No 4 (2023): November 2023
Publisher : Universitas Negeri Semarang

Show Abstract | Download Original | Original Source | Check in Google Scholar | DOI: 10.15294/sji.v10i4.46633

Abstract

Purpose: The objective of this study is to identify hate speech and abusive tweets in Indonesian using a Voting Classifier technique and Count Vectorizer with N-grams. Voting Classifier technique involves combining multiple classifiers like Random Forest and Support Vector Machines to improve classification accuracy.Methods: This research begins by preprocessing the data. Voting classifier uses Support Vector Machine algorithm and Random Forest algorithm. Support Vector Machine and Random Forest serve as the estimators for the voting classifier. As for feature extraction, N-gram and count vectorizer were employed. The effectiveness of the suggested procedures is the desired outcome.Result: Combining the Voting Classifier approach with Count Vectorizer feature extraction and using 1 gram of N-grams, or 82.50%, resulted in the best accuracy. From this study, it can be inferred that the approach employed to identify hate speech and abusive tweets is extremely practical.Novelty: Combining multiple classifiers and using feature extraction techniques like count vectorizer and N-gram with machine learning algorithms can be used for sentiment analysis to differentiate between hate speech and abusive tweets.