Ce, Win
Unknown Affiliation

Published: 2 Documents
Articles


STUDI KOMPARATIF ALGORITMA MACHINE LEARNING PADA ANALISIS SENTIMEN MEDIA SOSIAL (A Comparative Study of Machine Learning Algorithms for Social Media Sentiment Analysis)
Panjaitan, Febriyanti; Ce, Win; Oktafiandy, Hery; Kanugrahan, Ghanim; Ramdhani, Yudi; Hafizh Cahaya Putra, Vito; Permai, Antika
JATI (Jurnal Mahasiswa Teknik Informatika) Vol. 9 No. 2 (2025): JATI Vol. 9 No. 2
Publisher : Institut Teknologi Nasional Malang

DOI: 10.36040/jati.v9i2.13277

Abstract

Sentiment analysis on Twitter has become a major topic in research on public opinion in economics, politics, and social issues. Machine learning makes it possible to process text data efficiently for sentiment analysis. This study explores the literature on machine-learning-based sentiment analysis of Twitter data in the context of economics, politics, and social issues. The method used is a Systematic Literature Review (SLR), with articles collected from three main databases: IEEE Xplore, Google Scholar, and Scopus. After applying inclusion and exclusion criteria, 45 relevant articles were selected for analysis. The results show that Support Vector Machine (SVM) performed best, with an average accuracy of 85.3%, followed by Random Forest (83.7%) and Naïve Bayes (81.5%). KNN and Decision Tree performed worse, possibly because of their sensitivity to imbalanced data. Research trends indicate that sentiment analysis in economics mostly concerns the impact of economic policy; in politics it focuses on public opinion about elections and government policy; and in social issues it relates to social movements and health policy.

Evaluation of Machine Learning Models for Sentiment Analysis in the South Sumatra Governor Election Using Data Balancing Techniques
Panjaitan, Febriyanti; Ce, Win; Oktafiandi, Hery; Kanugrahan, Ghanim; Ramdhani, Yudi; Putra, Vito Hafizh Cahaya
Journal of Information System and Informatics Vol 7 No 1 (2025): March
Publisher : Universitas Bina Darma

DOI: 10.51519/journalisi.v7i1.1019

Abstract

Sentiment analysis is crucial for understanding public opinion, especially in political contexts like the 2024 South Sumatra gubernatorial election. Social media platforms such as Twitter and YouTube provide key sources of public sentiment, which can be analyzed using machine learning to classify opinions as positive, neutral, or negative. However, challenges such as data imbalance and selecting the right model to improve classification accuracy remain significant. This study compares five machine learning algorithms (SVM, Naïve Bayes, KNN, Decision Tree, and Random Forest) and examines the impact of data balancing on their performance. Data was collected via Twitter crawling (140 entries) and YouTube scraping (384 entries), and text features were extracted using CountVectorizer. The models were then evaluated on imbalanced and balanced datasets using accuracy, precision, recall, and F1-score. The Decision Tree and Random Forest models achieved the highest accuracies of 79.22% and 75.32% on imbalanced data, respectively. However, they also exhibited overfitting, as indicated by their near-perfect training performance. Naïve Bayes, on the other hand, demonstrated the lowest accuracy at 54.55% despite achieving high precision, suggesting frequent misclassification, particularly for the minority class. SVM and KNN also struggled with imbalanced data, recording accuracies of 58.44% and 63.64%, respectively. Significant improvements were observed after applying data balancing techniques. The accuracy of SVM increased to 71.43%, and KNN improved to 66.23%, indicating that these models are more stable and effective when class distributions are even. These findings highlight the substantial impact of data balancing on model performance, particularly for methods sensitive to class distribution. While tree-based models achieved high accuracy on imbalanced data, their tendency to overfit underscores the importance of balancing techniques to enhance model generalization.
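
The abstract does not say which balancing technique was applied, so the following is only a minimal sketch under an assumption: it uses random oversampling, one common way to even out class counts, alongside the accuracy, precision, recall, and F1 metrics the abstract names. The function names `random_oversample` and `accuracy_and_macro_scores` and the toy data are hypothetical, and the code is plain Python rather than the authors' pipeline.

```python
import random
from collections import Counter, defaultdict


def random_oversample(texts, labels, seed=0):
    """Duplicate randomly chosen samples of the smaller classes until every
    class has as many samples as the largest class (assumed technique)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for text, label in zip(texts, labels):
        by_class[label].append(text)
    target = max(len(items) for items in by_class.values())
    out_texts, out_labels = [], []
    for label, items in by_class.items():
        # Draw the missing samples with replacement from the same class.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


def accuracy_and_macro_scores(y_true, y_pred):
    """Return (accuracy, macro precision, macro recall, macro F1)."""
    classes = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precisions, recalls, f1s = [], [], []
    for c in classes:
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(classes)
    return accuracy, sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Toy, made-up data: four positive comments, one negative, one neutral.
texts = ["good", "great", "fine", "nice", "bad", "meh"]
labels = ["positive"] * 4 + ["negative", "neutral"]
balanced_texts, balanced_labels = random_oversample(texts, labels)
print(Counter(balanced_labels))

# Toy predictions showing that accuracy and macro precision can differ.
scores = accuracy_and_macro_scores(["a", "a", "b", "b"], ["a", "b", "b", "b"])
print(scores)
```

In practice, oversampling is normally applied to the training split only, so that duplicated samples do not leak into the evaluation data.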