Building of Informatics, Technology and Science
Vol 7 No 3 (2025): December 2025

Comparative Evaluation of the Naïve Bayes, KNN, Logistic Regression, SVM, and Extra Trees Algorithms for Tokopedia Sentiment Analysis

Ciputra, Indramawan
Fahmi, Amiq



Article Info

Publish Date
08 Dec 2025

Abstract

The rapid evolution of digital technology has shifted consumer behavior, particularly toward online shopping on e-commerce platforms such as Tokopedia. User-generated reviews on these platforms produce large volumes of text that can be analyzed systematically to uncover consumer sentiment. This study evaluates and compares the performance of five sentiment classification algorithms, namely Naïve Bayes, K-Nearest Neighbors (KNN), Logistic Regression, Support Vector Machine (SVM), and Extra Trees Classifier, on user review data from Tokopedia. The analytical workflow begins with web crawling, continues with text preprocessing (tokenization, case folding, and stop-word removal), and ends with sentiment classification using the five algorithms. Performance was evaluated with four standard metrics: accuracy, precision, recall, and F1-score. The results show that SVM achieved the highest accuracy at 85%, ahead of KNN and Extra Trees Classifier (84%), Logistic Regression (82%), and Naïve Bayes (79%). SVM’s superior performance is attributed to its ability to identify an optimal hyperplane that separates the sentiment classes, particularly in high-dimensional feature spaces. These findings offer practical guidance for developers of sentiment analysis systems in selecting an algorithm, and they support the application of Natural Language Processing (NLP) techniques within Indonesia’s e-commerce landscape.
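The preprocessing steps and evaluation metrics named in the abstract can be illustrated with a minimal, standard-library-only Python sketch. It is not the authors' implementation: the stop-word list, the regular expression used for tokenization, the binary "positive"/"negative" labels, and the function names `preprocess` and `binary_metrics` are all assumptions made for illustration, since the abstract does not specify them.

```python
import re

# Hypothetical, tiny Indonesian stop-word list for illustration only;
# the abstract does not say which list the study used.
STOP_WORDS = {"yang", "dan", "di", "ini", "itu"}


def preprocess(review, stop_words=STOP_WORDS):
    """Apply tokenization, case folding, and stop-word removal, in that order."""
    # Tokenization: split the review into runs of letters and digits.
    tokens = re.findall(r"[A-Za-z0-9]+", review)
    # Case folding: lowercase every token.
    tokens = [t.lower() for t in tokens]
    # Stop-word removal: drop tokens found in the stop-word list.
    return [t for t in tokens if t not in stop_words]


def binary_metrics(y_true, y_pred, positive="positive"):
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Made-up review and labels, not data from the study.
    print(preprocess("Barang bagus DAN pengiriman cepat!"))
    y_true = ["positive", "positive", "negative", "negative", "positive"]
    y_pred = ["positive", "negative", "negative", "positive", "positive"]
    print(binary_metrics(y_true, y_pred))
```

In this toy run, three of the five made-up predictions are correct, so accuracy is 0.6, while precision, recall, and F1 all equal 2/3. A full comparison like the one reported would additionally need a feature representation and the five classifiers, neither of which this sketch covers.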

Copyrights © 2025






Journal Info

Abbrev

bits

Publisher

Subject

Computer Science & IT

Description

Building of Informatics, Technology and Science (BITS) is an open-access journal that publishes scientific articles reporting research results in information technology and computing. Papers submitted to this journal are checked for plagiarism and peer-reviewed to maintain quality. ...