Purpose: This study aims to identify hate speech and abusive tweets in Indonesian using a Voting Classifier with Count Vectorizer and N-gram feature extraction. A Voting Classifier combines multiple classifiers, here Random Forest and Support Vector Machine, to improve classification accuracy.
Methods: The data are first preprocessed. Features are then extracted with Count Vectorizer and N-grams. Support Vector Machine and Random Forest serve as the estimators of the Voting Classifier. The outcome of interest is the effectiveness of the proposed approach.
Result: The best accuracy, 82.50%, was obtained by combining the Voting Classifier with Count Vectorizer feature extraction using 1-grams (unigrams). This result suggests that the approach is practical for identifying hate speech and abusive tweets.
Novelty: Combining multiple classifiers with feature extraction techniques such as Count Vectorizer and N-grams can be applied to sentiment analysis to differentiate between hate speech and abusive tweets.
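The pipeline described above can be sketched as follows. This is a minimal illustration and not the study's implementation. It assumes scikit-learn, hard (majority) voting, a linear SVM kernel, and default Random Forest settings, none of which are stated in the abstract. The toy texts and label names are hypothetical placeholders for preprocessed tweets.

```python
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Hypothetical toy data: placeholder tokens stand in for preprocessed tweets.
texts = [
    "hate_a hate_b", "hate_b hate_c", "hate_a hate_c", "hate_c hate_d",
    "abusive_a abusive_b", "abusive_b abusive_c",
    "abusive_a abusive_c", "abusive_c abusive_d",
]
labels = ["hate_speech"] * 4 + ["abusive"] * 4

model = make_pipeline(
    # ngram_range=(1, 1) keeps unigrams only, the setting reported as best.
    CountVectorizer(ngram_range=(1, 1)),
    VotingClassifier(
        estimators=[
            ("svm", SVC(kernel="linear")),  # kernel choice is an assumption
            ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ],
        voting="hard",  # assumption: the abstract does not say hard or soft
    ),
)

model.fit(texts, labels)
predictions = model.predict(["hate_a hate_c", "abusive_b abusive_a"])

# Terms learned by the vectorizer; with unigrams, none contains a space.
vocabulary = sorted(model.named_steps["countvectorizer"].vocabulary_)
```

Changing `ngram_range` to `(1, 2)` or `(2, 2)` would add or switch to bigram features, which is one way to compare N-gram settings on held-out data.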
Copyrights © 2023