The rapid evolution of digital technology has shifted consumer behavior, particularly toward online shopping on e-commerce platforms such as Tokopedia. User-generated reviews on these platforms yield large-scale textual data that can be analyzed systematically to uncover consumer sentiment. This study evaluates and compares the performance of five sentiment classification algorithms, namely Naive Bayes, K-Nearest Neighbors (KNN), Logistic Regression, Support Vector Machine (SVM), and Extra Trees Classifier, on user review data from Tokopedia. The analytical workflow begins with web crawling to collect reviews, followed by text preprocessing (tokenization, case folding, and stop-word removal), and ends with sentiment classification using the five algorithms. Performance was evaluated using four standard metrics: accuracy, precision, recall, and F1-score. The results show that SVM achieved the highest accuracy at 85%, ahead of KNN and Extra Trees Classifier (84% each), Logistic Regression (82%), and Naive Bayes (79%). SVM’s superior performance is attributed to its ability to find an optimal hyperplane that separates the sentiment classes effectively, particularly in high-dimensional feature spaces. These findings offer practical guidance for developers of sentiment analysis systems in selecting an algorithm, and they reinforce the strategic value of Natural Language Processing (NLP) techniques within Indonesia’s e-commerce landscape.
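As an illustration of the preprocessing steps and evaluation metrics named above, the following is a minimal sketch in plain Python, not the study's actual implementation. The stop-word list, the regular-expression tokenizer, the function names `preprocess` and `binary_metrics`, and the two-class label set (`"positive"` / `"negative"`) are all assumptions made for this example; the abstract does not specify them.

```python
import re

# Hypothetical, deliberately tiny Indonesian stop-word list (illustration only).
STOP_WORDS = {"yang", "dan", "di", "ini", "itu"}


def preprocess(review):
    """Apply case folding, tokenization, and stop-word removal to one review."""
    lowered = review.lower()                 # case folding
    tokens = re.findall(r"[a-z]+", lowered)  # simple alphabetic tokenization (assumed)
    return [t for t in tokens if t not in STOP_WORDS]  # stop-word removal


def binary_metrics(y_true, y_pred, positive="positive"):
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Made-up review text and labels, used only to exercise the functions.
    print(preprocess("Barang ini BAGUS dan cepat sampai!"))
    y_true = ["positive", "positive", "negative", "negative"]
    y_pred = ["positive", "negative", "negative", "positive"]
    print(binary_metrics(y_true, y_pred))
```

In practice, the token lists would be converted to numeric features before being passed to any of the five classifiers, and the same metric function could then be applied to each model's predictions on a held-out test set for a like-for-like comparison.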
Copyrights © 2025