Scientific Journal of Informatics
Vol 11, No 1 (2024): February 2024

Indonesian Hate Speech Text Classification Using Improved K-Nearest Neighbor with TF-IDF-ICSρF

Saputra, Nova Adi
Aeni, Khurotul
Saraswati, Nurul Mega



Article Info

Publish Date
25 Feb 2024

Abstract

Purpose: Freedom on social media makes it possible for users to disturb others through the messages they post, conduct that is restricted by the Electronic Information and Transactions Law (UU ITE). This research aims to find an effective method for classifying hate speech text, especially in Indonesian, across many categories, with the expectation of reducing such cases.

Methods: This study used 1,000 data items from Twitter with five labels: religion, race, physical, gender, and other (invective or slander). The process consisted of several preprocessing steps, data transformation using TF-IDF-ICSρF term weighting, and data mining using an Improved KNN algorithm. The results were then compared with those of the TF-IDF and KNN methods to evaluate the differences.

Result: TF-IDF-ICSρF with Improved KNN achieved an average accuracy of 88.11%, which is 17.81 percentage points higher than the 70.30% obtained by TF-IDF with K-Nearest Neighbor on the same data and parameters.

Novelty: Based on the comparison results, the TF-IDF-ICSρF and Improved KNN methods can effectively classify hate speech sentences that have many labels, with fairly good accuracy.
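For readers unfamiliar with the comparison baseline, the following is a minimal sketch of plain TF-IDF weighting followed by K-Nearest Neighbor voting. It does not implement TF-IDF-ICSρF or the Improved KNN variant, because the abstract does not give their formulas. The function names, the smoothed IDF formula, the use of cosine similarity, and the toy tokens and labels are all assumptions made for illustration and are not taken from the study's data.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Weight each tokenized document as raw term frequency * smoothed IDF.

    The smoothing (log((1 + n) / (1 + df)) + 1) is an assumed variant,
    not necessarily the one used in the paper.
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in df}
    vectors = [{t: c * idf[t] for t, c in Counter(doc).items()} for doc in docs]
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def knn_predict(query, train_vecs, train_labels, idf, k=3):
    """Label a tokenized query by majority vote among its k most similar documents."""
    # Terms unseen during training have no IDF and are ignored.
    q = {t: c * idf[t] for t, c in Counter(query).items() if t in idf}
    ranked = sorted(
        zip(train_vecs, train_labels),
        key=lambda pair: cosine(q, pair[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical placeholder corpus: neutral tokens and made-up class names.
train_docs = [
    ["alpha", "beta", "beta"],
    ["alpha", "gamma"],
    ["delta", "epsilon"],
    ["delta", "zeta", "zeta"],
]
train_labels = ["class_a", "class_a", "class_b", "class_b"]

train_vecs, idf = tfidf_vectors(train_docs)
prediction = knn_predict(["beta", "alpha"], train_vecs, train_labels, idf, k=3)
```

In this toy setup the query shares terms only with the two `class_a` documents, so the vote among the three nearest neighbors favors `class_a`. Per the abstract, the study replaces both stages of this baseline (the term weighting and the neighbor voting) with TF-IDF-ICSρF and Improved KNN.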

Copyrights © 2024






Journal Info

Abbrev

SJI

Publisher

Subject

Computer Science & IT

Description

Scientific Journal of Informatics, published by the Department of Computer Science, Semarang State University, is a scientific journal of Information Systems and Information Technology that includes scholarly writings on pure and applied research in the field of information systems and ...