Lontar Komputer: Jurnal Ilmiah Teknologi Informasi
Vol. 13, No. 1 (April 2022)

Spam Comment Detection on Instagram Using Machine Learning and Deep Learning

Antonius Rachmat Chrismanto (Universitas Kristen Duta Wacana)
Afiahayati Afiahayati (Universitas Gadjah Mada Yogyakarta)
Yunita Sari (Universitas Gadjah Mada Yogyakarta)
Anny Kartika Sari (Universitas Gadjah Mada Yogyakarta)
Yohanes Suyanto (Universitas Gadjah Mada Yogyakarta)



Article Info

Publish Date
10 Aug 2022

Abstract

The more popular a public figure is on Instagram (IG), the more followers they attract. When a public figure posts something, many other users comment on it. Not all of these comments are relevant to the post; some are advertisements, links, or clickbait. Comments that are irrelevant to the post are usually called spam comments. Spam comments interfere with the flow of information and may spread misleading information. This research compares machine learning (ML) and deep learning (DL) classification methods on an Indonesian IG spam comment dataset collected by the authors. The research was conducted in the following steps: dataset preparation, pre-processing, simple normalization, feature generation using TF-IDF and word embeddings, application of ML and DL classification methods, performance evaluation, and comparison. The authors compare the accuracy, F1, precision, and recall of the ML and DL results. The Linear SVM, Extreme Tree (ET), Regression, and Stochastic Gradient Descent algorithms reach an accuracy of 0.93, while the best DL method, the SimpleTransformer BERT architecture, reaches the highest accuracy of 0.94. The results show that the ML and DL methods do not differ significantly.
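As a rough illustration of two steps named in the abstract, TF-IDF feature generation and evaluation by accuracy, precision, recall, and F1, the following minimal sketch uses only the Python standard library. It is not the authors' code: the function names `tfidf` and `binary_metrics` are hypothetical, the smoothed IDF formula is one common variant chosen here as an assumption, and the abstract does not state which weighting or tooling the study actually used.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumption: raw term counts weighted by the smoothed
    idf = ln((1 + N) / (1 + df)) + 1. The abstract does not say
    which TF-IDF variant the study used.
    """
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    idf = {t: math.log((1 + n_docs) / (1 + d)) + 1 for t, d in df.items()}
    return [{t: c * idf[t] for t, c in Counter(doc).items()} for doc in docs]


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for one positive class (e.g. spam)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

Calling `tfidf` on a list of tokenized comments yields sparse weight dictionaries that a classifier could consume, and `binary_metrics(y_true, y_pred)` summarizes predictions against gold labels with spam encoded as `1`. In practice a library implementation would normally replace both helpers.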

Copyrights © 2022






Journal Info

Abbrev

lontar

Subject

Computer Science & IT

Description

Lontar Komputer [ISSN Print 2088-1541] [ISSN Online 2541-5832] is a journal that focuses on the theory, practice, and methodology of all aspects of technology in the field of computer science and engineering as well as productive and innovative ideas related to new technology and information ...