Khazanah Informatika : Jurnal Ilmu Komputer dan Informatika
Vol. 11 No. 2 (2025): October 2025

Hyperparameter Optimization of TF-IDF and SVM via Grid Search for Sentiment Analysis of Traveloka Customer Reviews

Muhammad Bayu Kurniawan (Universitas AMIKOM Yogyakarta)
Hanafi (Universitas AMIKOM Yogyakarta)
Riki Hikmianto (Universitas AMIKOM Yogyakarta)
Isnawati Muslihah (Institut Seni Indonesia Surakarta)



Article Info

Publish Date
02 Jun 2026

Abstract

Customer reviews on digital platforms are crucial for improving services and informing business decisions. This study focuses on automated sentiment analysis for Traveloka, a leading Indonesian online travel application. We propose a systematic hyperparameter optimization of a combined TF-IDF and Support Vector Machine (SVM) pipeline. A dataset of 20,200 user reviews was collected from the Google Play Store. After preprocessing and a two-stage labeling process, the data were split using stratified sampling (70% training, 30% testing). We conducted a comprehensive Grid Search with stratified 5-fold cross-validation to jointly optimize TF-IDF n-gram ranges (unigram, bigram, trigram) and SVM hyperparameters across four kernel types (Linear, RBF, Polynomial, Sigmoid). The results show that the Polynomial kernel with trigram features (C=5, gamma=1, degree=5, coef0=10) performs best, achieving a test accuracy of 87.10% and a macro F1-score of 86.9%. Error analysis revealed that the model detects negative feedback reliably (precision: 90.4%), but that its bag-of-n-grams representation struggles with contrastive sentences and informal language. The minimal performance differences among the top configurations suggest that the task is robust to specific parameter choices. For future work, employing contextual embeddings (e.g., IndoBERT) and exploring alternative algorithms such as Random Forest or Neural Networks could address these challenges. This research presents a thoroughly optimized traditional ML methodology that establishes a strong baseline for automated sentiment analysis of Indonesian user feedback.
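
The joint search space described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the candidate value lists, the cumulative n-gram ranges, and the step names `tfidf` and `svm` are assumptions made for illustration. Only the best configuration (C=5, gamma=1, degree=5, coef0=10 with the Polynomial kernel and trigram features) and the 5-fold setting come from the abstract. The grid is written as a list of per-kernel dictionaries, the same shape scikit-learn's `GridSearchCV` accepts for `param_grid`, so that each kernel is paired only with the hyperparameters it uses.

```python
from itertools import product

# Assumed n-gram ranges: the abstract names unigram, bigram, and trigram
# features but does not say whether the ranges are cumulative.
NGRAM_RANGES = [(1, 1), (1, 2), (1, 3)]

# Hypothetical candidate values. Only C=5, gamma=1, degree=5, coef0=10
# are reported in the abstract (as the best configuration).
C_VALUES = [1, 5, 10]
GAMMA_VALUES = [0.1, 1]
DEGREE_VALUES = [3, 5]
COEF0_VALUES = [0, 10]

# One sub-grid per kernel. The "tfidf__" and "svm__" prefixes assume a
# two-step pipeline whose steps are named "tfidf" and "svm".
PARAM_GRID = [
    {"tfidf__ngram_range": NGRAM_RANGES, "svm__kernel": ["linear"],
     "svm__C": C_VALUES},
    {"tfidf__ngram_range": NGRAM_RANGES, "svm__kernel": ["rbf"],
     "svm__C": C_VALUES, "svm__gamma": GAMMA_VALUES},
    {"tfidf__ngram_range": NGRAM_RANGES, "svm__kernel": ["poly"],
     "svm__C": C_VALUES, "svm__gamma": GAMMA_VALUES,
     "svm__degree": DEGREE_VALUES, "svm__coef0": COEF0_VALUES},
    {"tfidf__ngram_range": NGRAM_RANGES, "svm__kernel": ["sigmoid"],
     "svm__C": C_VALUES, "svm__gamma": GAMMA_VALUES,
     "svm__coef0": COEF0_VALUES},
]


def expand_grid(param_grid):
    """Yield every concrete configuration from a list of sub-grids."""
    for sub_grid in param_grid:
        keys = sorted(sub_grid)
        for values in product(*(sub_grid[key] for key in keys)):
            yield dict(zip(keys, values))


configs = list(expand_grid(PARAM_GRID))
n_folds = 5  # stratified 5-fold cross-validation, as stated in the abstract
print(len(configs), "configurations,", len(configs) * n_folds, "model fits")
```

Splitting the grid by kernel keeps the search from spending fits on combinations that cannot differ, such as a Linear kernel evaluated at several gamma values. With larger candidate lists the Polynomial sub-grid grows fastest, since it multiplies four SVM hyperparameters by the n-gram ranges.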

Copyrights © 2025






Journal Info

Abbrev

khif

Publisher

Subject

Computer Science & IT; Control & Systems Engineering

Description

Khazanah Informatika : Jurnal Ilmu Komputer dan Informatika, an Indonesian national journal, publishes high-quality research papers in the broad field of Informatics and Computer Science, which encompasses software engineering, information system development, computer systems, computer network, ...