This study compares TF-IDF and Word2Vec feature representations for emotion classification of Tokopedia e-commerce reviews using the LinearSVC algorithm. The dataset is PRDECT-ID, which consists of 5,400 Indonesian-language reviews labeled with positive and negative emotions. Preprocessing comprises case folding, removal of non-alphabetic characters, slang normalization, stopword removal, stemming with Sastrawi, and emoji handling. Features were extracted with TF-IDF and with Word2Vec, and a LinearSVC model was trained on each representation and evaluated using 5-fold cross-validation and holdout testing. The experimental results show that TF-IDF performs better, with an accuracy of 0.65, a macro-F1 score of 0.645, and a Cohen’s Kappa of 0.294, whereas Word2Vec attains an accuracy of 0.58 and a macro-F1 score of 0.540. These findings suggest that TF-IDF is more effective for the short, informal texts characteristic of Indonesian e-commerce reviews.
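To make the three reported metrics concrete, the following minimal sketch computes accuracy, macro-F1, and Cohen's Kappa for a two-class problem in plain Python. The label names and the 20 toy predictions are hypothetical and are not taken from PRDECT-ID or from the study's experiments; the function names are likewise illustrative and do not reflect the authors' implementation.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def cohen_kappa(y_true, y_pred):
    """Agreement between predictions and labels, corrected for chance."""
    n = len(y_true)
    p_observed = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    labels = set(y_true) | set(y_pred)
    # Chance agreement: product of the marginal label frequencies, summed over classes.
    p_expected = sum(true_counts[l] * pred_counts[l] for l in labels) / (n * n)
    if p_expected == 1.0:
        return 0.0
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical toy data: 10 "positive" and 10 "negative" reviews.
y_true = ["positive"] * 10 + ["negative"] * 10
y_pred = (
    ["positive"] * 7 + ["negative"] * 3    # 7 of 10 positives classified correctly
    + ["negative"] * 6 + ["positive"] * 4  # 6 of 10 negatives classified correctly
)

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
kappa = cohen_kappa(y_true, y_pred)
print(f"accuracy={acc:.3f} macro_f1={f1:.3f} kappa={kappa:.3f}")
```

On this balanced toy example, 13 of 20 predictions are correct and chance agreement is 0.5, so Kappa is (0.65 - 0.5) / (1 - 0.5) = 0.3. This illustrates why a two-class accuracy near 0.65 can correspond to a Kappa near 0.3: Kappa only credits the agreement that exceeds what the label frequencies would produce by chance.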
Copyrights © 2026