Jurnal Ilmu Komputer
Vol 17 No 1 (2024): Jurnal Ilmu Komputer

Balancing a Dataset for Classifying Comments on the Kampus Merdeka Program Using Synonym Replacement

Nifanto, Soleh (Unknown)



Article Info

Publish Date
30 Apr 2024

Abstract

Classifying comments about the Kampus Merdeka program is an essential step in analyzing user sentiment toward the features and services the program offers. The dataset used in this study, however, is imbalanced: the number of samples differs widely across classes, with a relatively high imbalance ratio of 5:1. Such imbalance generally yields a classification model that favors the majority class and performs poorly on the minority classes. This study therefore applies data augmentation with the Synonym Replacement method to generate varied samples for the minority classes, reducing the imbalance and improving classification performance. The method replaces words in the comment sentences with their synonyms to enrich the dataset and increase feature representation. The F-Measure rose from 0.6672 to 0.7875. Evaluation using ROC shows a maximum value of 0.96, whereas the class that did not receive augmentation tended to have lower ROC values, between 0.81 and 0.88.
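The balancing idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the synonym table, the whitespace tokenization, the toy comments, and the function names (`imbalance_ratio`, `synonym_replace`, `augment_class`) are all assumptions made for illustration, since the abstract does not state which lexicon or preprocessing the study used.

```python
import random
from collections import Counter

# Hypothetical toy synonym table (Indonesian); the study's actual lexicon is not stated.
SYNONYMS = {
    "bagus": ["baik", "hebat"],
    "sulit": ["susah", "rumit"],
    "program": ["kegiatan"],
}


def imbalance_ratio(labels):
    """Size of the largest class divided by the size of the smallest class."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def synonym_replace(text, rng, n_replacements=1):
    """Return a copy of `text` with up to `n_replacements` words swapped for a synonym."""
    tokens = text.split()
    candidates = [i for i, tok in enumerate(tokens) if tok.lower() in SYNONYMS]
    rng.shuffle(candidates)
    for i in candidates[:n_replacements]:
        tokens[i] = rng.choice(SYNONYMS[tokens[i].lower()])
    return " ".join(tokens)


def augment_class(texts, labels, target_label, n_new, rng):
    """Create up to `n_new` augmented samples from the comments of `target_label`."""
    pool = [t for t, y in zip(texts, labels) if y == target_label]
    new_texts = []
    if not pool:
        return new_texts
    # Bounded number of attempts, in case no comment contains a replaceable word.
    for _ in range(10 * n_new):
        if len(new_texts) == n_new:
            break
        source = rng.choice(pool)
        variant = synonym_replace(source, rng)
        if variant != source:
            new_texts.append(variant)
    return new_texts


# Toy data with a 5:1 imbalance between "positive" and "negative".
texts = ["program ini bagus"] * 5 + ["program ini sulit dan tidak bagus"]
labels = ["positive"] * 5 + ["negative"]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
extra = augment_class(texts, labels, "negative", n_new=4, rng=rng)
balanced_texts = texts + extra
balanced_labels = labels + ["negative"] * len(extra)

print(imbalance_ratio(labels))           # ratio before augmentation
print(imbalance_ratio(balanced_labels))  # ratio after augmentation
```

Only the minority class is augmented here, mirroring the abstract's description. The sketch does not deduplicate variants or check that a synonym fits the context, both of which a real pipeline would likely need.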

Copyright © 2024






Journal Info

Abbrev

jik

Publisher

Subject

Computer Science & IT; Language, Linguistic, Communication & Media; Library & Information Science

Description

JIK is a peer-reviewed scientific journal published since 2008 by the Informatics Department, Faculty of Mathematics and Natural Sciences, Udayana University. The aim of this journal is to publish high-quality articles dedicated to all aspects of the latest outstanding ...