Articles

Improving Multi-label Classification Performance on Imbalanced Datasets Through SMOTE Technique and Data Augmentation Using IndoBERT Model
Cahya, Leno Dwi; Luthfiarta, Ardytha; Krisna, Julius Immanuel Theo; Winarno, Sri; Nugraha, Adhitya
Jurnal Nasional Teknologi dan Sistem Informasi Vol 9 No 3 (2023): Desember 2023
Publisher : Departemen Sistem Informasi, Fakultas Teknologi Informasi, Universitas Andalas

DOI: 10.25077/TEKNOSI.v9i3.2023.290-298

Abstract

Sentiment and emotion analysis is a common classification task aimed at enhancing the benefit and comfort of consumers of a product. However, the data obtained is often unevenly distributed across the classes or aspects to be analyzed, a condition commonly known as an imbalanced dataset. Imbalanced datasets are frequently challenging in machine learning tasks, particularly for text datasets. Our research tackles imbalanced datasets using two techniques, namely SMOTE and augmentation. In the SMOTE technique, text datasets first need to be converted to a numerical representation using TF-IDF. The classification model employed is the IndoBERT model. Both oversampling techniques can address data imbalance by generating new synthetic data. The newly created dataset enhances the classification model's performance. With the augmentation technique, the classification model's performance improves by up to 20%, with accuracy reaching 78%, precision at 85%, recall at 82%, and an F1-score of 83%. The SMOTE technique achieves the better evaluation results of the two, raising the model's accuracy to 82%, with precision at 87%, recall at 85%, and an F1-score of 86%.
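
The abstract notes that SMOTE needs a numerical representation of the text, obtained with TF-IDF. A minimal sketch of that idea follows, using only the Python standard library. It is not the authors' pipeline: the toy sentences, the labels, and the helper names `tfidf_vectors` and `smote` are assumptions made for illustration, and the abstract does not say how the oversampled data is fed to IndoBERT.

```python
import math
import random
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocab, vectors) with simple smoothed TF-IDF weights.

    Assumes every document contains at least one whitespace-separated token.
    """
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n_docs = len(tokenized)
    doc_freq = Counter(t for toks in tokenized for t in set(toks))
    # Smoothed inverse document frequency, always positive.
    idf = {t: math.log((1 + n_docs) / (1 + doc_freq[t])) + 1 for t in vocab}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append([tf[t] / len(toks) * idf[t] for t in vocab])
    return vocab, vectors


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Nearest neighbours by Euclidean distance, excluding the sample itself.
        others = [v for j, v in enumerate(minority) if j != i]
        others.sort(key=lambda v: math.dist(base, v))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (nb - b) for b, nb in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: four "positif" reviews, three "negatif" reviews.
docs = [
    "produk bagus sekali",
    "pengiriman cepat dan bagus",
    "sangat puas dengan produk",
    "kualitas bagus harga murah",
    "produk rusak sangat kecewa",
    "pengiriman lambat kecewa",
    "kualitas buruk sekali",
]
labels = ["positif"] * 4 + ["negatif"] * 3

vocab, vectors = tfidf_vectors(docs)
minority = [v for v, y in zip(vectors, labels) if y == "negatif"]

# One synthetic minority sample brings both classes to four examples.
synthetic = smote(minority, n_new=1, k=2)
print(len(vectors), "original vectors,", len(synthetic), "synthetic vector")
```

Because each synthetic vector lies on the line segment between two real minority vectors, it stays inside the region the minority class already occupies in TF-IDF space. The synthetic points are feature vectors, not sentences, which is one practical difference from text augmentation.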

Comparing Optimizer Strategies For Enhancing Emotion Classification In IndoBERT Models
Krisna, Julius Immanuel Theo; Luthfiarta, Ardytha; Cahya, Leno Dwi; Winarno, Sri; Nugraha, Adhitya
Advance Sustainable Science, Engineering and Technology Vol 6, No 2 (2024): February - April
Publisher : Universitas PGRI Semarang

DOI: 10.26877/asset.v6i2.18228

Abstract

Emotions are among the reactions humans have when they receive a physical or verbal action. Every human action is based on emotion, and every opinion expressed in a comments column also carries its author's emotions. This research aims to classify five emotions, Marah (anger), Takut (fear), Senang (happiness), Cinta (love), and Sedih (sadness), and to evaluate the performance of three commonly used optimizers: Adam, RMSProp, and Nadam. The data were processed with the IndoBERT model for Indonesian text classification. The purpose of the research is to find the best optimizer for text classification. The results show that classification reached 90.21% with the Adam optimizer, 82.11% with the RMSProp optimizer, and 88.61% with the Nadam optimizer. The Adam optimizer applied to the IndoBERT model yielded the best results. This is a significant improvement over previous studies on emotion classification.