Vocatech : Vocational Education and Technology Journal
Vol 8, No 1 (2026): April

Implementation of SMOTE and GridSearchCV for Imbalanced Sentiment Classification on the Cabinet Reshuffle Issue

Rofi'i, Muhamad
Vikri, Muhammad Jauhar
Rohmah, Roihatur



Article Info

Publish Date
27 Apr 2026

Abstract

Social media has become a rapid barometer of public response to national issues. On X (Twitter), users not only share information but also shape opinion, criticism, and support around political events. Sentiment analysis is therefore essential for capturing public perception explicitly and tracing how policy is received in digital spaces. This study evaluates the effectiveness of SMOTE and GridSearchCV in improving machine learning performance for imbalanced sentiment classification. It is an applied quantitative study with a comparative computational experimental design. The dataset comprises 6,115 Indonesian-language tweets about a cabinet reshuffle, of which 83.5% are negative and 16.5% positive. The data are preprocessed, labeled with a lexicon-based dictionary, vectorized with TF-IDF, and split into training and test sets. Logistic Regression, Naïve Bayes, SVM, and Random Forest are compared under three scenarios: baseline (no SMOTE), SMOTE, and SMOTE with GridSearchCV for hyperparameter search. Macro F1 is the primary metric, supported by a confusion matrix for per-class evaluation, so that minority-class performance is not obscured by high aggregate accuracy. The baseline achieves high accuracy (86.02–89.13%) but low positive recall (21.29–42.57%) and low macro F1 (62.83–75.09%), indicating majority-class bias. With SMOTE, positive recall rises to 58.42–66.34% and macro F1 to 71.50–79.11%, while accuracy remains relatively stable (81.77–88.39%). The best model is Logistic Regression, with 88.23% accuracy, 78.90% macro F1, and 65.84% positive recall. In conclusion, SMOTE reduces majority-class bias, GridSearchCV stabilizes the performance gains, and macro F1 proves more representative than accuracy for imbalanced sentiment data, so the approach is suitable for other imbalanced text classification tasks.
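The abstract's point that accuracy can hide minority-class failure can be illustrated with a small arithmetic sketch. The counts below are hypothetical, not taken from the study: they assume a test set of 1,000 tweets that mirrors the reported 83.5% / 16.5% class split, and a degenerate classifier that labels every tweet negative. The function names are illustrative as well.

```python
def f1_per_class(tp, fp, fn):
    """F1 for one class from its true positives, false positives, and false negatives."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def accuracy_and_macro_f1(tn, fp, fn, tp):
    """Accuracy and macro F1 from binary confusion-matrix counts.

    'Positive' is the minority class; macro F1 is the unweighted mean
    of the per-class F1 scores.
    """
    total = tn + fp + fn + tp
    accuracy = (tn + tp) / total
    f1_pos = f1_per_class(tp, fp, fn)
    # For the negative class the roles swap: tn are its hits,
    # fn are its false alarms, and fp are its misses.
    f1_neg = f1_per_class(tn, fn, fp)
    return accuracy, (f1_pos + f1_neg) / 2


# Hypothetical counts: 835 negative and 165 positive tweets,
# with every tweet predicted negative.
acc, macro_f1 = accuracy_and_macro_f1(tn=835, fp=0, fn=165, tp=0)
print(f"accuracy = {acc:.3f}, macro F1 = {macro_f1:.3f}")
```

Under these assumed counts accuracy stays at 0.835 even though no positive tweet is found, while macro F1 falls to about 0.455 because the positive-class F1 is zero. This is the gap between the two metrics that motivates using macro F1 as the primary measure on imbalanced data.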

Copyrights © 2026






Journal Info

Abbrev

vocatech

Subject

Civil Engineering, Building, Construction & Architecture; Computer Science & IT; Education; Electrical & Electronics Engineering; Mechanical Engineering

Description

1. Vocational Studies 2. Civil Engineering 3. Electrical Engineering 4. Mechanical Engineering 5. Classroom Instruction in Vocational Context 6. English for Vocational Purposes 7. Innovation in Vocational ...