Jurnal Linguistik Komputasional
Vol 2 No 1 (2019): Vol. 2, No. 1

Identifikasi Konten Kasar Pada Tweet Bahasa Indonesia

Ahmad Fathan Hidayatullah (Unknown)
Aufa Aulia Fadila (Unknown)
Kiki Purnama Juwairi (Unknown)
Royan Abida Nayoan (Program Studi Sarjana Teknik Informatika, Universitas Islam Indonesia)



Article Info

Publish Date
25 Mar 2019

Abstract

This study aims to identify tweets containing abusive or offensive content. To do this, we performed five steps, such as, data collection, preprocessing, feature extraction, classification, and evaluation. We employed Multinomial Naïve Bayes and Support Vector Machine with linear kernel as our classification algorithm. Based on the experiment, it is known that the performance of the Support Vector Machine algorithm with linear kernel is superior overall compared to the Multinomial Naïve Bayes algorithm. It can be seen from the result of the values ​​of accuracy, precision, recall, and F1-score for the SVM algorithm, respectively 0.9928; 0.9914; 0.9946; and 0.9930. Whereas the value of accuracy, precision, recall, and F1-score of the Multinomial Naïve Bayes algorithm are 0.9834; 0.9912; 0.9762; and 0.9836. However, it can be concluded that the Support Vector Machine and Multinomial Naïve Bayes algorithm have almost the same performance. This is evidenced by the difference in performance achievements that are not too striking from both algorithm.

Copyrights © 2019






Journal Info

Abbrev

jlk

Publisher

Subject

Computer Science & IT

Description

Jurnal Linguistik Komputasional (JLK) menerbitkan makalah orisinil di bidang lingustik komputasional yang mencakup, namun tidak terbatas pada : Phonology, Morphology, Chunking/Shallow Parsing, Parsing/Grammatical Formalisms, Semantic Processing, Lexical Semantics, Ontology, Linguistic Resources, ...