Text classification is one of the most popular tasks in natural language processing, especially in the context of sentiment classification. Insufficient training data poses a significant challenge in many text classification studies. This research focuses on optimizing classification performance with the Passive Aggressive (PA) algorithm under limited training data. It compares conventional text representations such as TF-IDF with modern approaches based on word embeddings such as FastText and BERT. The primary dataset covers sentiment toward Kaesang Pangarep's appointment as chairman of PSI, gathered by crawling Twitter and labeled with positive, negative, and neutral sentiment. Two versions of the training data were used, each containing only 300 tweets balanced across the positive, negative, and neutral classes. The data were split 80% for training and 20% for validation in the search for an optimal model. External data on different issues, with pre-existing sentiment labels, was used to augment the training data. Experimental results show that BERT embeddings, used as input features for the Passive Aggressive method with hyperparameter tuning, outperformed TF-IDF features. Evaluation on the test data revealed that BERT features with Passive Aggressive achieved an F1-score of 0.52, surpassing the conventional TF-IDF representation with an F1-score of 0.42. The BERT language model thus contributed substantially to improving text classification performance, particularly for the Passive Aggressive method.
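To make the core method concrete, the sketch below shows the binary Passive Aggressive (PA-I) update rule in plain Python on a toy two-feature dataset. This is an illustrative minimal implementation, not the paper's code: the feature vectors, labels, and the aggressiveness parameter `C` are hypothetical, and a real experiment would use TF-IDF or BERT feature vectors with a multiclass PA classifier (e.g. scikit-learn's `PassiveAggressiveClassifier`).

```python
def pa_update(w, x, y, C=1.0):
    """One Passive-Aggressive (PA-I) update; labels y are in {-1, +1}."""
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    loss = max(0.0, 1.0 - margin)                   # hinge loss
    if loss == 0.0:
        return w                                    # "passive": margin already >= 1
    tau = min(C, loss / sum(xi * xi for xi in x))   # "aggressive" clipped step size
    return [wi + tau * y * xi for wi, xi in zip(w, x)]

# Hypothetical separable toy data: sign of the first feature decides the label.
data = [([2.0, 1.0], 1), ([1.5, -0.5], 1), ([-2.0, 0.5], -1), ([-1.0, -1.0], -1)]
w = [0.0, 0.0]
for _ in range(5):                                  # a few epochs over the stream
    for x, y in data:
        w = pa_update(w, x, y)

preds = [1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1 for x, _ in data]
print(preds)  # -> [1, 1, -1, -1], matching the true labels
```

The update only changes the weights when the margin constraint is violated, and the step size `tau` is exactly large enough to satisfy it (capped at `C`), which is why the method suits small, streamed training sets like the 300-tweet splits described above.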
Copyright © 2024