Classification of Hate Speech in TikTok Social Media Comments Using Naive Bayes Algorithm and TF-IDF Weighting
Utami, Putri Febi; Krisbiantoro, Dwi; Santiko, Irfan; Riyanto, Andi Dwi
Journal of Multimedia Trend and Technology, Vol. 4 No. 3 (2025)
Publisher: Universitas Amikom Purwokerto

DOI: 10.35671/jmtt.v4i3.102

Abstract

This research focuses on the classification of hate speech in Indonesian TikTok comments. As a social media platform with high interaction intensity, TikTok generates a large volume of comments with diverse linguistic characteristics, mixing formal and informal language. This variation makes content moderation difficult, particularly the automatic identification of hate speech. The dataset is secondary data obtained by combining public datasets with scraped TikTok comments, for an initial total of 5,698 general user comments. To improve data quality, the comments were pre-processed through text cleaning, tokenization, normalization, stop-word removal, and stemming, which left 4,542 comments suitable for modeling. Experimental results show that a Multinomial Naïve Bayes model with TF-IDF weighting classifies hate speech accurately: accuracy reached 93% before parameter optimization and rose to 95% after hyperparameter tuning, with an alpha value of 0.5. The confusion matrix shows a relatively low misclassification rate, although the class distribution in the dataset remains imbalanced. These findings indicate that the Multinomial Naïve Bayes approach is effective at recognizing linguistic patterns of hate speech in Indonesian TikTok comments, including text written in informal language.
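To illustrate the approach the abstract describes, the following is a minimal pure-Python sketch of TF-IDF weighting feeding a multinomial Naive Bayes classifier with smoothing parameter alpha. It is not the authors' implementation. The toy comments, labels, stop-word list, and every function and class name are invented for illustration. Normalization and stemming are omitted, and the smoothed IDF formula is an assumption, since the abstract does not specify one.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical toy stop-word list; a real pipeline would use a full Indonesian list.
STOP_WORDS = {"yang", "dan", "di", "ini", "itu"}


def preprocess(text):
    """Lowercase, replace non-letters with spaces, tokenize, drop stop words."""
    text = re.sub(r"[^a-z\s]", " ", text.lower())
    return [tok for tok in text.split() if tok not in STOP_WORDS]


def fit_idf(docs):
    """Smoothed inverse document frequency (assumed form): ln((1+n)/(1+df)) + 1."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    return {tok: math.log((1 + n) / (1 + d)) + 1.0 for tok, d in df.items()}


def tfidf(doc, idf):
    """Raw term count times IDF; tokens unseen during fitting are ignored."""
    counts = Counter(tok for tok in doc if tok in idf)
    return {tok: c * idf[tok] for tok, c in counts.items()}


class MultinomialNB:
    """Multinomial Naive Bayes over TF-IDF weights with additive smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, vectors, labels):
        vocab = {tok for vec in vectors for tok in vec}
        class_counts = Counter(labels)
        # Class priors from label frequencies.
        self.log_prior = {
            c: math.log(k / len(labels)) for c, k in class_counts.items()
        }
        # Sum the TF-IDF weight of each token within each class.
        weight = {c: defaultdict(float) for c in class_counts}
        for vec, y in zip(vectors, labels):
            for tok, w in vec.items():
                weight[y][tok] += w
        # Smoothed per-class token log-likelihoods.
        self.log_likelihood = {}
        for c in class_counts:
            denom = sum(weight[c].values()) + self.alpha * len(vocab)
            self.log_likelihood[c] = {
                tok: math.log((weight[c][tok] + self.alpha) / denom)
                for tok in vocab
            }
        return self

    def predict(self, vec):
        scores = {
            c: self.log_prior[c]
            + sum(w * self.log_likelihood[c][tok] for tok, w in vec.items())
            for c in self.log_prior
        }
        return max(scores, key=scores.get)


# Invented toy data, for illustration only (not from the study's dataset).
train = [
    ("kamu bodoh dan jelek", "hate"),
    ("dasar bodoh tidak berguna", "hate"),
    ("videonya bagus sekali", "neutral"),
    ("terima kasih infonya bagus", "neutral"),
]
docs = [preprocess(text) for text, _ in train]
labels = [label for _, label in train]
idf = fit_idf(docs)
model = MultinomialNB(alpha=0.5).fit([tfidf(d, idf) for d in docs], labels)

print(model.predict(tfidf(preprocess("dasar bodoh"), idf)))
print(model.predict(tfidf(preprocess("video bagus"), idf)))
```

In practice, alpha would be chosen by comparing held-out accuracy across several candidate values; the abstract reports 0.5 as the value that gave the authors their best result.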