Sentiment Analysis Of Comments On Indonesian Political Speech Videos On Youtube Using FastText
Khailla Savana, Bella Risma; Arifianto, Deni; Muharom, Lutfi Ali
Smart Techno (Smart Technology, Informatics and Technopreneurship) Vol. 7 No. 2 (2025)
Publisher : Primakara University

DOI: 10.59356/smart-techno.v7i2.159

Abstract

The advancement of digital technology has transformed how society accesses and responds to political information, particularly through platforms like YouTube, which serve as arenas for public discourse. Comments on political speech videos often contain complex sentiments such as irony, slang, and code-mixing, which are difficult to identify using traditional sentiment analysis methods. This study aims to analyze public sentiment toward the Indonesian President’s political speeches on YouTube from 2014 to 2024 using the FastText word embedding approach and to compare its performance with the TF-IDF + Logistic Regression method. The evaluation was conducted on three sentiment classes using automatically labeled data and oversampling experiments to address class imbalance. The results show that FastText achieved an accuracy of 76.82%, slightly higher than TF-IDF + Logistic Regression at 74.11%. Although the difference in accuracy is relatively small, the FastText model demonstrated more stable performance on informal texts and varied contexts. The use of oversampling helped balance predictions across classes without significantly improving accuracy. This study highlights the potential of FastText to enhance the effectiveness of Indonesian-language sentiment analysis, particularly for political comments on social media, while also revealing the limitations of automatic labeling that may affect classification outcomes.
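
The abstract mentions oversampling to address class imbalance but does not say which variant was used. As a minimal sketch, assuming plain random oversampling (duplicating minority-class samples until every class matches the largest one), the idea looks like this. The function name `random_oversample` and the toy comments are hypothetical and are not taken from the study.

```python
import random
from collections import Counter


def random_oversample(texts, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until every class
    is as large as the largest class. Assumed variant, not the paper's."""
    rng = random.Random(seed)

    # Group the samples by their class label.
    by_class = {}
    for text, label in zip(texts, labels):
        by_class.setdefault(label, []).append(text)

    # Every class is grown to the size of the largest class.
    target = max(len(items) for items in by_class.values())

    out_texts, out_labels = [], []
    for label, items in by_class.items():
        # Draw the missing samples with replacement from the same class.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


# Hypothetical toy comments; not data from the study.
texts = ["bagus sekali", "mantap pak", "setuju", "kurang jelas", "biasa saja"]
labels = ["positive", "positive", "positive", "negative", "neutral"]

balanced_texts, balanced_labels = random_oversample(texts, labels)
print(Counter(balanced_labels))
```

Oversampling of this kind is normally applied to the training split only, so that duplicated samples do not leak into the evaluation data.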

Topic Analysis in Political Speech Video Transcripts Using the Latent Dirichlet Allocation (LDA) Method
Septiara, Dhea Intan; Arifianto, Deni; Suharso, Wiwik
JUSTINDO (Jurnal Sistem dan Teknologi Informasi Indonesia) Vol. 11 No. 1 (2026): JUSTINDO
Publisher : Universitas Muhammadiyah Jember

DOI: 10.32528/justindo.v11i1.4044

Abstract

Political speeches are an important medium for conveying a national leader’s vision, mission, and policy directions to the public. This study aims to identify and analyze the main topics in the video transcripts of President Joko Widodo’s political speeches during the 2014–2024 period using the Latent Dirichlet Allocation (LDA) method. The data consist of 185 press conference speech videos obtained from the Indonesian Cabinet Secretariat’s YouTube channel and converted into text using speech-to-text technology. The dataset is divided into 81 videos from the 2014–2023 period as training data and 104 videos from 2024 as testing data. The analysis process includes text preprocessing, rule-based automatic labeling, LDA model training, and evaluation using coherence score and perplexity. The results show that in the training data, Infrastructure and Economy are the dominant topics, reflecting the government’s focus on physical development and economic growth. In contrast, in the 2024 testing data, Healthcare emerges as the most dominant topic, followed by Infrastructure, Economy, Education, and Technology. The Infrastructure topic consistently achieves the highest coherence score of 0.85, indicating strong semantic consistency among its constituent terms. This study contributes to understanding the temporal dynamics of political communication and demonstrates the effectiveness of LDA in analyzing political speech data derived from video transcripts.
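
The abstract names perplexity as an evaluation metric without defining it. A minimal sketch of the standard definition, the exponential of the negative average per-token log-likelihood on held-out text, is given below. The function name `perplexity` and the toy log-probabilities are hypothetical; the study's actual tooling is not stated in the abstract.

```python
import math


def perplexity(token_log_probs):
    """Perplexity = exp(-(sum of per-token log-likelihoods) / (token count)).

    token_log_probs: one list per document, holding the natural-log
    probability that a model assigns to each token of that document.
    """
    total_tokens = sum(len(doc) for doc in token_log_probs)
    if total_tokens == 0:
        raise ValueError("need at least one token")
    total_log_lik = sum(sum(doc) for doc in token_log_probs)
    return math.exp(-total_log_lik / total_tokens)


# Hypothetical held-out documents in which every token has probability 0.25.
uniform_doc = [math.log(0.25)] * 8
print(perplexity([uniform_doc, uniform_doc]))  # approximately 4
```

Lower values mean the model is less "surprised" by unseen text. A model that spreads probability uniformly over four choices has a perplexity of about 4, as in the toy input above.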

Emotion Classification in Political Speech Text Data Using the RoBERTa Method (Klasifikasi Emosi pada Data Teks Pidato Politik Menggunakan Metode RoBERTa)
Saputro, Safitri Ramadhayanti; Arifianto, Deni; Adi Cahyanto, Triawan
Jurnal Teknologi Informasi dan Ilmu Komputer Vol 13 No 1: February 2026
Publisher : Fakultas Ilmu Komputer, Universitas Brawijaya

DOI: 10.25126/jtiik.2026131

Abstract

Emotion analysis in texts is a significant branch of Natural Language Processing (NLP), particularly in understanding implicit messages in political speeches. Political speeches not only convey information but also express emotions to shape public opinion. This study utilizes the RoBERTa model to classify emotions in the speeches of President Joko Widodo during the 2014–2024 period. The dataset was obtained from official video transcripts, resulting in 2952 paragraphs labeled automatically using the pre-trained model `Indonesian-roberta-base-emotion-classifier`. The preprocessing stages included text cleaning, lowercasing, tokenization, and one-hot encoding. The RoBERTa model was fine-tuned using a batch size of 16, a learning rate of 1e-5, and 3 epochs. Performance evaluation was conducted using a confusion matrix and metrics such as accuracy, precision, recall, and F1-score. The results show that the model can classify five emotions (anger, fear, happy, love, and sadness) with 90% accuracy, 91% precision, 90% recall, and an F1-score of 0.91. These findings demonstrate that RoBERTa is effective for emotion classification in Indonesian political speech texts and contributes to the development of NLP in political communication contexts.
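
The abstract reports accuracy, precision, recall, and F1-score derived from a confusion matrix, but does not state how the per-class values were averaged. As a minimal sketch, assuming macro averaging (an unweighted mean over classes), the metrics can be computed as follows. The function name `macro_metrics` and the 2-class toy matrix are hypothetical and do not reproduce the study's results.

```python
def macro_metrics(confusion):
    """Compute accuracy and macro-averaged precision, recall, and F1.

    confusion[i][j] = number of samples whose true class is i and whose
    predicted class is j. Macro averaging is an assumption here.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))

    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f1 = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f1)

    return {
        "accuracy": correct / total,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical 2-class confusion matrix; not taken from the study.
toy_confusion = [
    [8, 2],  # true class 0: 8 predicted correctly, 2 predicted as class 1
    [1, 9],  # true class 1: 1 predicted as class 0, 9 predicted correctly
]
metrics = macro_metrics(toy_confusion)
print(metrics)
```

With weighted averaging instead, each class's precision, recall, and F1 would be weighted by its number of true samples, which can give different values on imbalanced data.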