Sistemasi: Jurnal Sistem Informasi
Vol 15, No 5 (2026): Sistemasi: Jurnal Sistem Informasi

Improving Bioethanol Sentiment Analysis Performance using SMOTE in Machine Learning Model Comparison

Rajhu Ilham Pradana (Universitas Dinamika Bangsa)
Jasmir Jasmir (Universitas Dinamika Bangsa Indonesia)
Gunardi Gunardi (Universitas Dinamika Bangsa Indonesia)



Article Info

Publish Date
28 May 2026

Abstract

Sentiment analysis of public policy on social media is important for government evaluation, but it is often hampered by highly imbalanced datasets. This study addresses the issue through a case study of public sentiment toward bioethanol fuel policies on YouTube. After preprocessing, the cleaned dataset consisted of 2,409 comments: 1,430 negative, 734 neutral, and only 245 positive. This imbalance severely degraded the performance of classical Machine Learning (ML) models, particularly in detecting the minority class. The study applied TF-IDF weighting for feature extraction, then used the Synthetic Minority Oversampling Technique (SMOTE) to balance the training data (1,927 samples) before comparing three ML algorithms: Logistic Regression, Support Vector Machine (SVM), and LightGBM. Evaluation on the testing dataset (482 samples) shows that SMOTE substantially improved the models’ ability to recognize the “Positive” class. The LightGBM model combined with SMOTE achieved the best performance, with an accuracy of 64.11%. In particular, SMOTE raised the minority-class F1-score from a baseline of 18.18% to 35.29%. These findings confirm that handling imbalanced data is a critical step in producing valid and reliable sentiment analysis results.
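The balancing step described in the abstract can be illustrated with a minimal sketch of the SMOTE idea: each synthetic minority sample is placed on the line segment between an existing minority sample and one of its nearest same-class neighbours. The abstract does not state which implementation the authors used, so everything here is an assumption made for illustration: the function name `smote_oversample`, the parameters `k` and `seed`, and the toy two-dimensional vectors that stand in for TF-IDF rows. As in the study, only the training data is resampled.

```python
import math
import random
from collections import Counter


def smote_oversample(X, y, k=5, seed=0):
    """Grow every class to the size of the largest one, SMOTE-style.

    X: list of equal-length lists of floats (e.g. dense TF-IDF rows).
    y: list of class labels, same length as X.
    Hypothetical helper for illustration, not the authors' code.
    """
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out = [list(row) for row in X]
    y_out = list(y)

    for label, count in counts.items():
        members = [row for row, lab in zip(X, y) if lab == label]
        if count == target or len(members) < 2:
            continue  # majority class, or no neighbour to interpolate with
        k_eff = min(k, len(members) - 1)
        for _ in range(target - count):
            i = rng.randrange(len(members))
            base = members[i]
            # Nearest same-class neighbours of the base sample (Euclidean)
            others = [m for j, m in enumerate(members) if j != i]
            others.sort(key=lambda m: math.dist(base, m))
            neighbour = rng.choice(others[:k_eff])
            # The synthetic point lies between base and neighbour
            gap = rng.random()
            synthetic = [b + gap * (n - b) for b, n in zip(base, neighbour)]
            X_out.append(synthetic)
            y_out.append(label)
    return X_out, y_out


# Toy training set (illustrative values, not data from the study)
X_train = [
    [0.9, 0.1], [0.8, 0.2], [0.7, 0.1], [0.9, 0.3], [0.8, 0.0], [0.6, 0.2],
    [0.5, 0.5], [0.4, 0.6], [0.5, 0.4],
    [0.1, 0.9], [0.2, 0.8],
]
y_train = ["negative"] * 6 + ["neutral"] * 3 + ["positive"] * 2

X_bal, y_bal = smote_oversample(X_train, y_train, k=5, seed=42)
print(Counter(y_bal))  # every class now has as many samples as "negative"
```

The held-out test set is left untouched so that the evaluation reflects the original class distribution. In practice, a library implementation such as `SMOTE` from the imbalanced-learn package (used via `fit_resample`) would normally replace a hand-written helper like this one.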

Copyrights © 2026






Journal Info

Abbrev

stmsi

Publisher

Subject

Computer Science & IT Electrical & Electronics Engineering

Description

Sistemasi is a scientific journal in the field of computer science, published by the Information Systems study program of Universitas Islam Indragiri, Tembilahan, Riau. It is published three times a year, in January, May, and September. The general focus and scope of Sistemasi cover the field of Information Systems, ...