BAREKENG: Jurnal Ilmu Matematika dan Terapan
Vol 20 No 3 (2026): BAREKENG: Journal of Mathematics and Its Application

ENHANCING E-COMMERCE REVIEW SENTIMENT ANALYSIS WITH LINEAR SVM: FEATURE-EXTRACTION AND HYPERPARAMETER COMPARISONS

Fauziah Hanum (Faculty of Economics and Business, Universitas Labuhanbatu, Indonesia)
Richi Andrianto (Information Systems, Institut Teknologi dan Sains Padang Lawas Utara, Indonesia)
Anita Sri Rejeki Hutagaol (Faculty of Economics and Business, Universitas Labuhanbatu, Indonesia)
Nurhanna Harahap (Faculty of Teacher Training and Education, Universitas Al Washliyah Labuhanbatu, Indonesia)
Ibnu Rasyid Munthe (Faculty of Science and Technology, Universitas Labuhanbatu, Indonesia)



Article Info

Publish Date
08 Apr 2026

Abstract

Sentiment analysis of e-commerce reviews is essential for understanding customer perceptions and supporting service and marketing decisions. However, previous SVM-based studies often report results for only one feature representation or one tuning approach, which offers limited guidance on the most effective practical configuration. This study addresses that gap by benchmarking a linear Support Vector Machine (SVM) on TF-IDF and Word2Vec representations and comparing three hyperparameter tuning strategies (Grid Search, Random Search, and Optuna) on an Indonesian-language dataset of customer product reviews. The held-out test set contains 871 reviews. Class imbalance is handled by applying SMOTE to the training set only, which yields a balanced training set of 2902 samples. Using stratified validation with Accuracy, Precision, Recall, F1 score, and ROC AUC, the best configuration is TF-IDF with an Optuna-tuned linear SVM, achieving 86.68% accuracy, an F1 score of 0.87, and an ROC AUC of about 0.93 to 0.94. For Word2Vec, the best result is obtained with Random Search, reaching 84.38% accuracy, an F1 score of 0.84, and an ROC AUC of about 0.92. These findings indicate that TF-IDF is a stronger match for a linear SVM in this setting, and that Optuna provides the most consistent gains for TF-IDF. Limitations include the use of binary sentiment labels and an evaluation restricted to a linear SVM with simple document-level Word2Vec aggregation, so performance may differ across other domains, platforms, and languages. Future research will examine richer document embeddings, nonlinear and contextual models, multi-class or aspect-level sentiment, and broader cross-platform validation to improve generalizability.
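The abstract's point about applying SMOTE to the training set only can be illustrated with a minimal sketch. This is not the authors' implementation: it is a simplified, standard-library version of the SMOTE idea (interpolating between a minority sample and one of its nearest minority neighbours), and the function names `smote` and `balance_training_set`, the toy 2-D feature vectors, and the choice of `k` are all assumptions made for illustration. In the study the vectors would presumably be TF-IDF or Word2Vec features. For a binary task, a balanced training set of 2902 samples would correspond to 1451 samples per class.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base`, excluding `base` itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


def balance_training_set(X_train, y_train, seed=0):
    """Oversample the smaller class of a binary training set until both are equal."""
    by_class = {}
    for x, y in zip(X_train, y_train):
        by_class.setdefault(y, []).append(x)
    if len(by_class) != 2:
        raise ValueError("this sketch assumes exactly two classes")
    (min_label, min_rows), (_, maj_rows) = sorted(
        by_class.items(), key=lambda item: len(item[1])
    )
    new_rows = smote(min_rows, len(maj_rows) - len(min_rows), seed=seed)
    # Build new lists so the original training data is left unmodified.
    return X_train + new_rows, y_train + [min_label] * len(new_rows)


# Toy training split: six samples of class 1, three of class 0 (hypothetical data).
X_train = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2], [0.0, 0.4],
           [0.4, 0.1], [0.9, 1.0], [1.0, 0.8], [0.8, 0.9]]
y_train = [1, 1, 1, 1, 1, 1, 0, 0, 0]

# The held-out test split is never passed to the oversampler.
X_test = [[0.1, 0.2], [0.9, 0.9]]
y_test = [1, 0]

X_bal, y_bal = balance_training_set(X_train, y_train)
print(len(X_bal), y_bal.count(0), y_bal.count(1))  # 12 6 6
```

Keeping the test split out of the oversampling step means the reported metrics are computed on real reviews only, with no synthetic samples derived from test data.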

Copyrights © 2026






Journal Info

Abbrev

barekeng

Subject

Computer Science & IT; Control & Systems Engineering; Economics, Econometrics & Finance; Energy; Engineering; Mathematics; Mechanical Engineering; Physics; Transportation

Description

BAREKENG: Jurnal Ilmu Matematika dan Terapan is a scientific publication that publishes articles reporting the results of research or studies in the fields of Pure Mathematics and Applied Mathematics. The focus and scope of BAREKENG: Jurnal Ilmu Matematika dan Terapan are as follows: - Pure ...