Freedom of expression on Twitter is often accompanied by hate speech, which may take the form of provocation, incitement, or insults targeting race, religion, gender, and other characteristics. Machine learning techniques can be applied to classify such content automatically. This study therefore implements a machine learning-based approach to the automatic classification of hate speech aspects and evaluates the accuracy of the results. Support Vector Machine (SVM) is used as the classifier, with FastText as the word embedding method. The aspects classified are abusive, individual, group, religion, race, physical, gender, and other. The dataset is a collection of Indonesian tweets from Kaggle that have already been labeled by aspect. The study also tested combinations of preprocessing methods, namely filtering with stemming, and the FastText pre-trained model. Using SVM with FastText word embedding, parameters C = 1.0 and gamma = 1.0, an RBF kernel, and a 90:10 ratio of training data to testing data, the best results were an accuracy of 98%, precision of 98%, recall of 98%, and F1-score of 97% on the physical and gender aspects. Without FastText word embedding, the results were an accuracy of 84%, precision of 74%, recall of 86%, and F1-score of 79% on the abusive aspect.
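The classifier configuration reported above (RBF kernel, C = 1.0, gamma = 1.0, 90:10 split) can be sketched with scikit-learn as follows. This is a minimal illustration and not the study's actual code. It makes several assumptions: each aspect is treated as a separate binary label, a tweet is represented by the average of its word vectors, and a small random lookup table (`word_vectors`) stands in for the pre-trained FastText model. The helper names `embed_tweet` and `train_aspect_classifier`, the 16-dimensional vectors, and the toy data are all hypothetical.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

EMBEDDING_DIM = 16  # hypothetical size, chosen small for this toy example


def embed_tweet(tokens, word_vectors, dim=EMBEDDING_DIM):
    """Average the vectors of known tokens; return a zero vector if none are known."""
    vectors = [word_vectors[t] for t in tokens if t in word_vectors]
    if not vectors:
        return np.zeros(dim)
    return np.mean(vectors, axis=0)


def train_aspect_classifier(features, labels, seed=0):
    """Fit an RBF-kernel SVM for one aspect and score it on a 10% hold-out set."""
    x_train, x_test, y_train, y_test = train_test_split(
        features, labels, test_size=0.10, random_state=seed, stratify=labels
    )
    # Hyperparameters as reported in the abstract.
    model = SVC(kernel="rbf", C=1.0, gamma=1.0)
    model.fit(x_train, y_train)
    predicted = model.predict(x_test)
    scores = {
        "accuracy": accuracy_score(y_test, predicted),
        "precision": precision_score(y_test, predicted, zero_division=0),
        "recall": recall_score(y_test, predicted, zero_division=0),
        "f1": f1_score(y_test, predicted, zero_division=0),
    }
    return model, scores


# Toy stand-in data (NOT the Kaggle dataset): random "word vectors" and random "tweets".
rng = np.random.default_rng(0)
vocab = ["w%d" % i for i in range(20)]
word_vectors = {w: rng.normal(size=EMBEDDING_DIM) for w in vocab}
tweets = [[vocab[i] for i in rng.integers(0, len(vocab), size=5)] for _ in range(200)]
features = np.vstack([embed_tweet(t, word_vectors) for t in tweets])

# Synthetic binary label for a single aspect, split at the median to keep classes balanced.
labels = (features[:, 0] > np.median(features[:, 0])).astype(int)

model, scores = train_aspect_classifier(features, labels)
print(scores)
```

In a real run, `word_vectors` would be replaced by lookups into the pre-trained FastText model, and `train_aspect_classifier` would be called once per aspect column of the labeled dataset.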
Copyrights © 2025