Abstract - As social media has grown, YouTube has become one of the main places where people share their opinions about movies through the comment section. This study combines lexicon-based and machine learning techniques to examine viewer sentiment toward the trailer for the film Andai Ibu Tidak Menikah dengan Ayah. The data were collected by scraping YouTube comments and then passed through a series of preprocessing steps: cleaning, case folding, normalization, tokenization, stopword removal, and stemming. The Indonesian Sentiment Lexicon (InSet Lexicon) was used for sentiment labeling, and TF-IDF was used for feature extraction. The Synthetic Minority Oversampling Technique (SMOTE) was applied to correct class imbalance in the data. Naïve Bayes, Logistic Regression, and Support Vector Machine (SVM) were then used to classify sentiment, and model performance was assessed with accuracy, precision, recall, and F1-score. Naïve Bayes performed best, with an accuracy of 81.90%, precision of 79.95%, recall of 81.90%, and F1-score of 80.87%. By comparison, SVM and Logistic Regression reached accuracies of 66.67% and 62.86%, respectively. These results show that Naïve Bayes outperformed the other two algorithms in classifying the sentiment of movie trailer comments.
Keywords: Sentiment Analysis; Lexicon-Based; Machine Learning; TF-IDF; YouTube
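The lexicon-based labeling step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the lexicon maps each word to a signed polarity weight and that a comment's label follows the sign of the summed weights. The lexicon entries, the weights, the function names, and the three-way positive/negative/neutral split are all hypothetical choices made for illustration; they are not taken from InSet or from the study's actual procedure.

```python
import re

# Hypothetical toy lexicon: word -> signed polarity weight.
# These entries and weights are invented for illustration only;
# they are NOT taken from the InSet Lexicon.
TOY_LEXICON = {
    "bagus": 4,
    "keren": 3,
    "sedih": -3,
    "jelek": -4,
    "bosan": -2,
}


def tokenize(comment):
    """Case-fold the comment and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", comment.lower())


def polarity_score(comment, lexicon=TOY_LEXICON):
    """Sum the lexicon weights of all tokens; unknown words count as 0."""
    return sum(lexicon.get(token, 0) for token in tokenize(comment))


def label_comment(comment, lexicon=TOY_LEXICON):
    """Assign a label from the sign of the summed polarity score.

    The neutral class for a score of exactly 0 is an assumption of this sketch.
    """
    score = polarity_score(comment, lexicon)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    comments = [
        "Filmnya bagus dan keren!",
        "Ceritanya jelek, bikin bosan",
        "Kapan tayang di bioskop?",
    ]
    for comment in comments:
        print(label_comment(comment), "|", comment)
```

In a pipeline like the one described, labels produced this way would serve as the training targets for the classifiers, with the TF-IDF vectors as the input features.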
Copyrights © 2026