Emotion analysis of social media comments plays an important role in understanding user sentiment. This study compares two text representation methods, TF-IDF and CountVectorizer, in terms of their effect on the performance of the Support Vector Machine (SVM) algorithm for emotion classification. The dataset used is a subset of GoEmotions consisting of 1,000 YouTube comments labeled with 27 distinct emotion categories. The dataset was split into training and testing sets at an 80:20 ratio. Each text representation was tested separately with an SVM using a linear kernel, and the resulting models were evaluated on accuracy, precision, recall, and F1-score. TF-IDF slightly outperformed CountVectorizer in accuracy (35% vs. 32%), whereas CountVectorizer performed marginally better in precision and F1-score. These findings suggest that the choice of text representation influences emotion classification outcomes. This research contributes to the development of text-based emotion analysis systems for social media platforms.
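The comparison described above can be sketched roughly as follows. This is a minimal illustration with scikit-learn, not the authors' code: the function name `compare_vectorizers`, the toy comments and labels, the random seed, the default vectorizer settings, and the weighted averaging of precision, recall, and F1 are all assumptions, since the abstract does not specify them.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC


def compare_vectorizers(texts, labels, seed=42):
    """Train a linear-kernel SVM on each representation and score both.

    Hypothetical helper: seed and averaging mode are assumptions.
    """
    # 80:20 train/test split, as stated in the abstract.
    x_train, x_test, y_train, y_test = train_test_split(
        texts, labels, test_size=0.2, random_state=seed
    )
    vectorizers = {"tfidf": TfidfVectorizer(), "count": CountVectorizer()}
    results = {}
    for name, vectorizer in vectorizers.items():
        # Each representation gets its own SVM with a linear kernel.
        model = make_pipeline(vectorizer, SVC(kernel="linear"))
        model.fit(x_train, y_train)
        predicted = model.predict(x_test)
        # Weighted averaging is an assumption; the abstract does not name one.
        precision, recall, f1, _ = precision_recall_fscore_support(
            y_test, predicted, average="weighted", zero_division=0
        )
        results[name] = {
            "accuracy": accuracy_score(y_test, predicted),
            "precision": precision,
            "recall": recall,
            "f1": f1,
        }
    return results


# Toy stand-in data (invented for illustration, not the GoEmotions subset).
texts = (
    [f"I love this video so much, great work number {i}" for i in range(10)]
    + [f"this makes me so angry, terrible upload number {i}" for i in range(10)]
    + [f"wow I did not expect that ending at all number {i}" for i in range(10)]
)
labels = ["joy"] * 10 + ["anger"] * 10 + ["surprise"] * 10

results = compare_vectorizers(texts, labels)
for name, scores in results.items():
    print(name, {metric: round(value, 2) for metric, value in scores.items()})
```

On real data, the two result dictionaries can be compared metric by metric, which is how one representation can lead on accuracy while the other leads on precision or F1-score.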
Copyright © 2025