Sentiment analysis is a key technique for understanding public opinion, particularly on social media platforms such as YouTube. However, informal language, including slang, makes accurate sentiment classification difficult. This study aims to improve sentiment analysis by combining a Support Vector Machine (SVM) classifier with the SMOTEENN resampling technique to address class imbalance. A total of 3,375 YouTube comments on the film Pengabdi Setan 2: Communion were collected using the YouTube Data API. Preprocessing consists of text cleaning, tokenization, stopword removal, stemming, and slang word normalization using kamusalay.csv to standardize informal expressions. Features are represented using TF-IDF, and sentiment labels are assigned using VADER. Experimental results show that the SVM model achieves an accuracy of 98% but struggles to detect negative sentiment, as indicated by lower recall (38%) and F1-score (53%) for the negative class. Although SMOTEENN improves class balance, further refinements, such as adjusting classification thresholds and integrating deep learning techniques, are needed to improve sentiment detection, particularly for informal and emotionally nuanced language. This study contributes to sentiment analysis models by demonstrating the effectiveness of slang word normalization in handling non-standard language variation. Future work will explore more sophisticated language models to improve sentiment classification accuracy.
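The slang word normalization step can be illustrated with a minimal sketch. It assumes that kamusalay.csv is a headerless two-column file pairing each slang form with its standard form; the file layout, the sample entries, and the function names `load_slang_lexicon` and `normalize_slang` are illustrative assumptions, not the study's actual implementation.

```python
import csv
import io
import re

# Hypothetical stand-in for kamusalay.csv. Assumed layout: headerless,
# two columns (slang form, standard form). Entries are illustrative only.
SAMPLE_LEXICON_CSV = """gk,tidak
bgt,banget
yg,yang
serem,seram
"""


def load_slang_lexicon(handle):
    """Read (slang, standard) rows into a lookup dict, skipping malformed rows."""
    lexicon = {}
    for row in csv.reader(handle):
        if len(row) >= 2 and row[0].strip():
            lexicon[row[0].strip().lower()] = row[1].strip().lower()
    return lexicon


def normalize_slang(text, lexicon):
    """Lowercase the text, split it into word tokens, and replace known slang."""
    tokens = re.findall(r"\w+", text.lower())
    # Tokens missing from the lexicon pass through unchanged.
    return [lexicon.get(token, token) for token in tokens]


lexicon = load_slang_lexicon(io.StringIO(SAMPLE_LEXICON_CSV))
tokens = normalize_slang("Filmnya serem bgt, gk nyesel nonton!", lexicon)
print(tokens)
# ['filmnya', 'seram', 'banget', 'tidak', 'nyesel', 'nonton']
```

In a pipeline like the one described, a lookup of this kind would run on each comment before stopword removal and stemming, so that later steps see standardized tokens; words absent from the lexicon, such as "nyesel" here, remain unnormalized.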
Copyrights © 2025