Sentiment analysis of e-commerce reviews is essential for understanding customer perceptions and for supporting service and marketing decisions. However, previous SVM-based studies often report results using only one feature representation or one tuning approach, which provides limited guidance on the most effective practical configuration. This study addresses this gap by benchmarking a linear Support Vector Machine (SVM) across TF-IDF and Word2Vec representations and comparing three hyperparameter tuning strategies (Grid Search, Random Search, and Optuna) on an Indonesian-language dataset of customer product reviews. The held-out test set contains 871 reviews, while class imbalance is handled by applying SMOTE to the training set only, yielding a balanced training set of 2,902 samples. Using stratified validation with Accuracy, Precision, Recall, F1 score, and ROC AUC, the best configuration is TF-IDF with an Optuna-tuned linear SVM, achieving 86.68 percent accuracy, an F1 score of 0.87, and an ROC AUC of about 0.93 to 0.94. For Word2Vec, the best result is obtained with Random Search, reaching 84.38 percent accuracy, an F1 score of 0.84, and an ROC AUC of about 0.92. These findings indicate that TF-IDF is a stronger match for a linear SVM in this setting, and that Optuna provides the most consistent gains for TF-IDF. Limitations include the use of binary sentiment labels and an evaluation focused on a linear SVM with simple document-level Word2Vec aggregation, so performance may differ across other domains, platforms, and languages. Future research will examine richer document embeddings, nonlinear and contextual models, multi-class or aspect-level sentiment, and broader cross-platform validation to improve generalizability.
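The pipeline described above (TF-IDF features, oversampling applied to the training split only, and a tuned linear SVM) can be sketched as follows. This is a minimal illustrative sketch, not the authors' code: the toy Indonesian review snippets are invented, scikit-learn's `GridSearchCV` stands in for the three tuning strategies compared in the study, and naive random oversampling of the minority class stands in for SMOTE (which would normally come from the `imbalanced-learn` package).

```python
import numpy as np
import scipy.sparse as sp
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import LinearSVC
from sklearn.metrics import accuracy_score

# Hypothetical toy corpus (1 = positive, 0 = negative), deliberately imbalanced
# 2:1; the real study uses a much larger Indonesian review dataset.
texts = [
    "produk bagus sekali", "pengiriman cepat dan aman",
    "sangat puas dengan barang ini", "kualitas sesuai harga",
    "kualitas buruk", "barang rusak saat tiba",
] * 12
labels = [1, 1, 1, 1, 0, 0] * 12

X_train_txt, X_test_txt, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, stratify=labels, random_state=0)

# TF-IDF is fit on the training split only, then applied to the test split.
vec = TfidfVectorizer()
Xtr = vec.fit_transform(X_train_txt)
Xte = vec.transform(X_test_txt)

# Stand-in for SMOTE: duplicate random minority-class rows of the TRAINING
# set only, so the test set stays untouched (as in the paper).
rng = np.random.default_rng(0)
y_tr = np.asarray(y_train)
counts = np.bincount(y_tr)
extra_idx = rng.choice(np.flatnonzero(y_tr == counts.argmin()),
                       size=counts.max() - counts.min(), replace=True)
Xtr_bal = sp.vstack([Xtr, Xtr[extra_idx]])
y_bal = np.concatenate([y_tr, y_tr[extra_idx]])

# Tune the SVM regularization strength C; Grid Search here stands in for
# the Grid Search / Random Search / Optuna comparison in the study.
grid = GridSearchCV(LinearSVC(random_state=0),
                    {"C": [0.1, 1.0, 10.0]}, cv=3, scoring="f1")
grid.fit(Xtr_bal, y_bal)
acc = accuracy_score(y_test, grid.predict(Xte))
print("best C:", grid.best_params_["C"], "test accuracy:", round(acc, 2))
```

Swapping `TfidfVectorizer` for averaged Word2Vec document vectors, or `GridSearchCV` for `RandomizedSearchCV` or an Optuna study, reproduces the other configurations the abstract compares.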
Copyright © 2026