Kuncoro Hadi
Universitas Amikom Yogyakarta


Analysis of K-NN with the Integration of Bag of Words, TF-IDF, and N-Grams for Hate Speech Classification on Twitter
Kuncoro Hadi; Ema Utami
JUITA: Jurnal Informatika, Vol. 12 No. 2, November 2024
Publisher : Department of Informatics Engineering, Universitas Muhammadiyah Purwokerto

DOI: 10.30595/juita.v12i2.23829

Abstract

Social media has become one of the primary communication channels in the modern world, but it is also a platform where hate speech spreads easily. This study evaluates the performance of a hate speech classification model that uses the K-Nearest Neighbors (K-NN) algorithm with three feature extraction techniques: Bag of Words (BoW), Term Frequency-Inverse Document Frequency (TF-IDF), and N-Grams. The dataset consists of 13,169 entries representing a diverse range of hate speech examples commonly encountered on social media platforms. The experiments assess the efficacy of the model with each feature extraction technique. The findings show that the K-NN model performs best when the k parameter is set to 3 (k=3). Under this configuration, the model achieves an accuracy of 86.88%, a precision of 88.27%, a recall of 86.88%, and an F1-Score of 86.50%. These results show that combining TF-IDF feature extraction with the K-NN algorithm produces superior performance in hate speech classification.
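
To make the pipeline described in the abstract concrete, the following is a minimal, dependency-free sketch of K-NN classification over BoW, TF-IDF, and word N-gram features. It is not the authors' implementation. The abstract does not state the distance measure, the tokenization, or the IDF variant, so cosine similarity, whitespace tokenization, and a common smoothed IDF formula are assumptions here. The six-sentence corpus and the helper names (`ngrams`, `fit_idf`, `vectorize`, `knn_predict`, `classify`) are hypothetical and exist only for illustration.

```python
import math
from collections import Counter

def ngrams(text, n_max=1):
    """Word n-grams of length 1..n_max from lowercase whitespace tokens."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]

def fit_idf(docs_terms):
    """Smoothed IDF (an assumed variant): ln((1 + N) / (1 + df)) + 1."""
    n_docs = len(docs_terms)
    df = Counter(term for terms in docs_terms for term in set(terms))
    return {t: math.log((1 + n_docs) / (1 + d)) + 1 for t, d in df.items()}

def vectorize(terms, idf=None):
    """Sparse vector: raw counts (BoW) if idf is None, else counts * idf."""
    counts = Counter(terms)
    if idf is None:
        return dict(counts)
    # Terms never seen in training have no IDF weight and are dropped.
    return {t: c * idf[t] for t, c in counts.items() if t in idf}

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def knn_predict(query, train_vecs, train_labels, k=3):
    """Majority vote among the k training vectors most similar to query."""
    ranked = sorted(range(len(train_vecs)),
                    key=lambda i: cosine(query, train_vecs[i]),
                    reverse=True)
    votes = Counter(train_labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy corpus; the study's 13,169-entry dataset is not reproduced.
train_texts = [
    "i hate you all",
    "you people are disgusting",
    "i hate those people",
    "have a lovely day",
    "what a lovely sunny morning",
    "enjoy your day friends",
]
train_labels = ["hate", "hate", "hate", "neutral", "neutral", "neutral"]

def classify(text, n_max=1, use_tfidf=True, k=3):
    """Fit features on the toy corpus, then classify one text with K-NN."""
    docs = [ngrams(t, n_max) for t in train_texts]
    idf = fit_idf(docs) if use_tfidf else None
    vecs = [vectorize(d, idf) for d in docs]
    query = vectorize(ngrams(text, n_max), idf)
    return knn_predict(query, vecs, train_labels, k)

if __name__ == "__main__":
    print(classify("i hate people"))                 # TF-IDF, unigrams
    print(classify("lovely day", use_tfidf=False))   # BoW, unigrams
    print(classify("i hate people", n_max=2))        # TF-IDF, uni+bigrams
```

Switching `use_tfidf` and `n_max` reproduces the three feature settings compared in the study, while `k=3` mirrors the best-performing configuration reported in the abstract. On a corpus this small the three settings agree; differences of the kind the study reports would only appear on a realistic dataset.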