Twitter is widely used by public figures, politicians, celebrities, and organizations to communicate with the public. However, the platform's freedom of speech is often misused, giving rise to hate speech, particularly against Islam. This study aims to develop a text classification system for detecting hate speech against Islam and to evaluate the performance of Multinomial Naïve Bayes (MNB) on this task. The data were obtained by crawling Twitter and passed through several pre-processing steps: cleaning, case folding, tokenizing, stop word removal, and stemming. The processed data were then transformed with a Bag of Words representation to compute word frequencies, which served as input to MNB. The first test compared different ratios of training to test data while varying the alpha hyperparameter between its minimum and maximum values. The second test used k-fold cross-validation for model validation. The highest accuracy, 85%, was obtained at a 90:10 training-to-test data ratio with the maximum alpha value. With 10-fold cross-validation, the model achieved an average accuracy of 79.09%, with the highest accuracy of 85.05% in the 4th iteration. This study demonstrates that the training-to-test data ratio, the alpha parameter, and cross-validation influence MNB's performance in classifying hate speech.
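The classification and validation steps described above can be illustrated with a minimal, dependency-free sketch. It assumes the tweets have already been cleaned, case-folded, tokenized, stripped of stop words, and stemmed. The class `MultinomialNB`, the function `k_fold_accuracy`, the label names, and the toy data are all hypothetical and chosen for illustration; this is not the study's implementation, and the toy data says nothing about the reported accuracies.

```python
import math
from collections import Counter, defaultdict


class MultinomialNB:
    """Multinomial Naive Bayes over Bag of Words counts (illustrative sketch)."""

    def __init__(self, alpha=1.0):
        # alpha is the additive smoothing strength; it must be positive here
        # so that unseen word/class pairs never produce log(0).
        if alpha <= 0:
            raise ValueError("alpha must be positive in this sketch")
        self.alpha = alpha

    def fit(self, docs, labels):
        # docs: list of token lists; labels: one class label per document.
        self.vocab = {word for doc in docs for word in doc}
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)  # per-class Bag of Words
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(doc)
        self.total_words = {
            c: sum(self.word_counts[c].values()) for c in self.class_counts
        }
        return self

    def predict(self, doc):
        n_docs = sum(self.class_counts.values())
        scores = {}
        for c in self.class_counts:
            denom = self.total_words[c] + self.alpha * len(self.vocab)
            score = math.log(self.class_counts[c] / n_docs)  # log prior
            for word in doc:
                if word in self.vocab:  # words never seen in training are skipped
                    score += math.log((self.word_counts[c][word] + self.alpha) / denom)
            scores[c] = score
        # Sorting first makes tie-breaking deterministic.
        return max(sorted(scores), key=scores.get)


def k_fold_accuracy(docs, labels, k=10, alpha=1.0):
    """Return the accuracy of each of the k folds (requires len(docs) >= k)."""
    accuracies = []
    for fold in range(k):
        test_idx = set(range(fold, len(docs), k))  # every k-th document
        train_docs = [d for i, d in enumerate(docs) if i not in test_idx]
        train_labels = [y for i, y in enumerate(labels) if i not in test_idx]
        model = MultinomialNB(alpha=alpha).fit(train_docs, train_labels)
        correct = sum(model.predict(docs[i]) == labels[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return accuracies


# Hypothetical, already pre-processed toy data (not from the study).
TOY_DOCS = [
    ["hate", "insult", "group"],
    ["respect", "peace", "group"],
    ["insult", "attack"],
    ["thanks", "peace"],
    ["hate", "attack", "group"],
    ["respect", "thanks", "group"],
]
TOY_LABELS = ["hate", "neutral", "hate", "neutral", "hate", "neutral"]

if __name__ == "__main__":
    fold_accuracies = k_fold_accuracy(TOY_DOCS, TOY_LABELS, k=3, alpha=1.0)
    print("per-fold accuracy:", fold_accuracies)
    print("mean accuracy:", sum(fold_accuracies) / len(fold_accuracies))
```

Repeating the call to `k_fold_accuracy` with different `alpha` values, or replacing the fold loop with a single 90:10 split, would mirror the two tests described above in spirit.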
Copyrights © 2024