Social media platforms such as YouTube have become a primary medium for public discourse on socio-political issues, including the "August 25th protests," which triggered strong polarization in the digital space. The volume of comments makes manual review impractical and calls for a computational approach to sentiment analysis. This study classifies public sentiment into positive and negative categories and compares the performance of Naive Bayes, Random Forest, and Support Vector Machine (SVM). These algorithms were selected because they are computationally efficient on high-dimensional text data compared with deep learning models. The methodology involved collecting 2,917 comments via the YouTube Data API v3, followed by text preprocessing, lexicon-based automated labeling, and TF-IDF feature weighting. Because the dataset was imbalanced, with negative sentiment accounting for 78.8% of comments, stratified sampling was applied to preserve class proportions. The results indicate that SVM achieved the highest accuracy at 88.2%, outperforming Random Forest (83.1%) and Naive Bayes (81.2%). SVM's advantage is attributed to its search for an optimal hyperplane that maximizes the margin between classes, which supports stable performance on imbalanced data. This research contributes a classification framework for understanding public opinion dynamics on specific political issues in Indonesia.
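As a rough illustration of the preprocessing steps named above (lexicon-based labeling, TF-IDF weighting, and stratified sampling), the following standard-library Python sketch may help. It is not the study's implementation: the toy lexicon, the tie-breaking rule, the tf * log(N / df) weighting variant, and the 20% test ratio are all assumptions made for this example.

```python
import math
import random
from collections import Counter, defaultdict

# Hypothetical toy lexicon; the lexicon used in the study is not specified here.
POSITIVE = {"peaceful", "support", "good"}
NEGATIVE = {"violent", "chaos", "bad"}


def lexicon_label(tokens):
    """Label a tokenized comment by lexicon hits; ties fall back to negative (an assumption)."""
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    return "positive" if score > 0 else "negative"


def tfidf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()})
    return vectors


def stratified_split(labels, test_ratio=0.2, seed=42):
    """Split indices so each class keeps roughly the same share in train and test."""
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    rng = random.Random(seed)
    train, test = [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        k = round(len(idxs) * test_ratio)  # per-class test size
        test.extend(idxs[:k])
        train.extend(idxs[k:])
    return train, test


if __name__ == "__main__":
    # Synthetic labels with roughly the 78.8% negative share reported above.
    labels = ["negative"] * 788 + ["positive"] * 212
    train, test = stratified_split(labels)
    neg_share = sum(labels[i] == "negative" for i in test) / len(test)
    print(f"test size={len(test)}, negative share={neg_share:.3f}")
```

Splitting each class separately is what keeps the minority class from being under-represented by chance in either subset. Note that this preserves the imbalance rather than removing it.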
Copyright © 2026