Jurnal Algoritma
Vol 23 No 1 (2026): Jurnal Algoritma

Klasifikasi Emosi pada Kalimat Bahasa Indonesia Menggunakan Transformer (Emotion Classification of Indonesian Sentences Using a Transformer)

Rey Aji Darusalam (Universitas Jenderal Achmad Yani)
Ridwan Ilyas (Universitas Jenderal Achmad Yani)
Fatan Kasyidi (Universitas Jenderal Achmad Yani)



Article Info

Publish Date
31 May 2026

Abstract

Text-based emotion classification is a challenging task in natural language processing, particularly for Indonesian, which has flexible sentence structure and widely varying informal usage. This study develops an emotion classification model for Indonesian sentences using a Transformer-based approach, specifically the IndoBERT model. The dataset is an adaptation of the SemEval 2025 benchmark, translated into Indonesian, and covers five emotion categories: anger, fear, joy, sadness, and surprise. The research process involves data preprocessing through IndoBERT tokenization, padding, and label encoding, and the application of three data balancing strategies: Synthetic Minority Over-sampling Technique (SMOTE), class weighting, and random oversampling. IndoBERT is fine-tuned using the [CLS] token representation as the main feature for classification. All balancing approaches are evaluated using accuracy, precision, recall, and F1-score. SMOTE achieves the highest accuracy at 58.31%, while class weighting yields the highest recall at 48.22%. Random oversampling shows relatively stable performance across all metrics. Surprise is the most difficult class to recognize under all three approaches, which points to a need for improvements in both data and model design. In addition, all models exhibit mild overfitting, as shown by the gap between training and validation performance. These findings demonstrate that IndoBERT can be effectively used for emotion classification of Indonesian sentences, with performance strongly influenced by the data balancing strategy employed. The study offers initial insights for the development of emotion classification systems that account for the context and characteristics of the Indonesian language.
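Two of the balancing strategies named in the abstract, class weighting and random oversampling, can be illustrated with a minimal sketch. The abstract does not give the study's actual implementation, so everything below is an assumption: the inverse-frequency "balanced" weighting formula, the function names balanced_class_weights and random_oversample, and the toy data are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count).

    Assumption: this common "balanced" heuristic stands in for whatever
    weighting scheme the study actually used.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}


def random_oversample(texts, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until every class
    matches the size of the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for text, label in zip(texts, labels):
        by_class.setdefault(label, []).append(text)
    target = max(len(rows) for rows in by_class.values())
    out_texts, out_labels = [], []
    for label, rows in by_class.items():
        # Sample with replacement to fill the gap to the majority size.
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for text in rows + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


# Hypothetical toy data (not taken from the study's dataset).
toy_labels = ["joy"] * 6 + ["anger"] * 3 + ["surprise"] * 1
toy_texts = [f"sentence {i}" for i in range(len(toy_labels))]

weights = balanced_class_weights(toy_labels)
resampled_texts, resampled_labels = random_oversample(toy_texts, toy_labels)
```

In this sketch the rarest class receives the largest weight, which would typically be passed to a weighted loss during fine-tuning, while oversampling instead changes the training set itself. SMOTE is not shown because it interpolates between numeric feature vectors rather than duplicating rows, and the abstract does not say which representation it was applied to.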

Copyrights © 2026






Journal Info

Abbrev

algoritma

Publisher

Subject

Computer Science & IT

Description

Jurnal Algoritma is a journal that publishes research results in the fields of Information Technology (TI), Information Systems (SI), Software Engineering (RPL), Multimedia (MM), and Computer Science (Computer ...