Jurnal Informatika: Jurnal Pengembangan IT
Vol 11, No 2 (2026)

Generative AI vs SMOTE: A Case Study of Text Data Balancing in Sentiment Analysis

Dyah Sulistyowati Rahayu (Universitas Pancasila)
Iman Paryudi (Universitas Pancasila)
Erin Divayaning (Universitas Pancasila)
Afni Puspita Zahra (Universitas Pancasila)
Arsya Yan Duribta (Universitas Pancasila)



Article Info

Publish Date
30 Apr 2026

Abstract

Imbalanced data remains a major challenge in sentiment analysis: when positive reviews dominate, classifiers become biased toward the majority class and recognize minority classes poorly. This study addresses the imbalance by using a Large Language Model (LLM) to generate synthetic negative reviews and compares the results with the traditional SMOTE method. Data are collected through web scraping and preprocessed with standard text-cleaning steps, including tokenization, stopword removal, and stemming. The LLM then generates additional negative samples, while SMOTE serves as the baseline. Classification uses a Support Vector Machine (SVM) on TF-IDF features, and performance is evaluated with accuracy, precision, recall, and F1-score. The findings show that LLM augmentation produces synthetic data whose distribution closely matches that of the original data, as confirmed by the Kolmogorov-Smirnov test and the Wasserstein distance. Furthermore, the SVM trained on LLM-augmented data achieved higher accuracy and more balanced performance than the SVM trained on SMOTE-augmented data, particularly on minority classes. In conclusion, LLM-based augmentation offers a more effective and more natural approach to balancing text data for sentiment analysis and significantly improves classification quality. Future research may explore combining LLMs with other generative models to extend the approach to numerical and multimodal datasets.
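The two ideas the abstract relies on, SMOTE-style interpolation and distribution comparison, can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names, the toy vectors, and the choice of review length (in tokens) as the compared feature are all assumptions made for illustration. The abstract does not say which feature the Kolmogorov-Smirnov and Wasserstein comparisons were run on.

```python
import random
from bisect import bisect_right


def smote_sample(x, neighbor, rng):
    """Return one synthetic point on the segment between a minority
    sample x and one of its minority-class neighbors (the core SMOTE step)."""
    gap = rng.random()  # interpolation factor in [0, 1)
    return [a + gap * (b - a) for a, b in zip(x, neighbor)]


def _ecdf(sorted_values, v):
    """Empirical CDF of an already sorted sample, evaluated at v."""
    return bisect_right(sorted_values, v) / len(sorted_values)


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap
    between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    points = sorted(set(a) | set(b))
    return max(abs(_ecdf(a, v) - _ecdf(b, v)) for v in points)


def wasserstein_1d(a, b):
    """1-D Wasserstein-1 distance: the area between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    points = sorted(set(a) | set(b))
    total = 0.0
    for left, right in zip(points, points[1:]):
        total += abs(_ecdf(a, left) - _ecdf(b, left)) * (right - left)
    return total


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical 2-D feature vectors standing in for TF-IDF rows.
    print(smote_sample([0.0, 1.0], [1.0, 3.0], rng))

    # Hypothetical review lengths (tokens) for real vs. synthetic negatives.
    real_lengths = [12, 15, 9, 20, 14]
    synthetic_lengths = [13, 16, 10, 18, 15]
    print(ks_statistic(real_lengths, synthetic_lengths))
    print(wasserstein_1d(real_lengths, synthetic_lengths))
```

Smaller values of both measures indicate that the synthetic sample is closer to the real one. The sketch also shows why SMOTE can look less natural for text: it interpolates in feature space and produces vectors that need not correspond to any readable review, whereas an LLM generates the text itself.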

Copyright © 2026






Journal Info

Abbrev

informatika

Publisher

Subject

Computer Science & IT

Description

The scope encompasses Informatics Engineering, Computer Engineering, and Information Systems, including but not limited to the following: 1. Information Systems: information management, e-Government, e-business and e-commerce, spatial information systems, geographical information systems, IT governance ...