Garuda - Garba Rujukan Digital

P-Index (2021 - 2026): 0.23

Journals this author has published in: Journal of Applied Informatics and Computing

Ulhaq, Afif Langgeng Dhiya

Unknown Affiliation

Author-ID : 9366534

Computer Science & IT

Published: 1 document

Articles

Comparing Machine Learning Models for Sentiment Analysis of Tokopedia Reviews
Authors: Ulhaq, Afif Langgeng Dhiya; Suprayogi, Suprayogi
Journal of Applied Informatics and Computing Vol. 9 No. 6 (2025): December 2025
Publisher: Politeknik Negeri Batam

DOI: 10.30871/jaic.v9i6.11239

This study presents a comparative evaluation of machine learning models for sentiment analysis of Tokopedia user reviews written in Indonesian. The objective is to assess how effectively three algorithms, Support Vector Machine (SVM), Random Forest (RF), and Multilayer Perceptron (MLP), classify customer sentiment in Tokopedia reviews from the Google Play Store. The dataset, collected between January and October 2025, consists of 10,236 unique entries after preprocessing, which included text cleaning, case folding, tokenization, stopword removal, normalization using a verified Indonesian word normalization dictionary, and optional stemming with the Sastrawi library. The reviews were divided into positive and negative categories based on rating polarity (4–5 stars as positive; 1–2 stars as negative). Each model was evaluated using both hold-out validation (80:20 split) and 5-fold cross-validation, with accuracy, precision, recall, and F1-score as metrics. Experimental results indicate that SVM achieved the highest accuracy, 0.88, outperforming Random Forest (0.85) and MLP (0.83). These findings indicate that SVM performs more robustly on sparse TF-IDF vector features and is more resistant to noise in informal Indonesian expressions. The research further discusses the linguistic challenges inherent in Indonesian sentiment analysis, including code-mixing, abbreviations, and non-standard words, and proposes preprocessing strategies to mitigate them. The outcomes of this study contribute to enhancing the reliability of sentiment-based decision support systems in Indonesian e-commerce platforms. The methodological framework developed here can serve as a baseline for future work involving hybrid or deep-learning approaches such as LSTM or IndoBERT for improved contextual understanding.
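The labeling rule, preprocessing steps, and evaluation metrics named in the abstract can be illustrated with a minimal standard-library sketch. This is not the authors' code: the stopword list and normalization dictionary below are tiny hypothetical stand-ins, the function names are invented for illustration, 3-star reviews are assumed to be discarded (the abstract does not say), and the optional Sastrawi stemming step is omitted.

```python
import re

# Hypothetical, tiny stand-ins for the resources the abstract mentions.
# The study's actual stopword list and normalization dictionary are not given.
STOPWORDS = {"yang", "dan", "di", "ini", "itu"}
NORMALIZATION = {"gk": "tidak", "ga": "tidak", "bgt": "banget", "apk": "aplikasi"}


def label_from_rating(stars):
    """Map a star rating to a label: 4-5 positive, 1-2 negative.

    Assumption: 3-star reviews are dropped, signalled here by None.
    """
    if stars >= 4:
        return "positive"
    if stars <= 2:
        return "negative"
    return None


def preprocess(text):
    """Apply the steps in the order the abstract lists them."""
    text = re.sub(r"[^A-Za-z\s]", " ", text)   # text cleaning: keep letters only
    text = text.lower()                        # case folding
    tokens = text.split()                      # whitespace tokenization
    tokens = [t for t in tokens if t not in STOPWORDS]   # stopword removal
    return [NORMALIZATION.get(t, t) for t in tokens]     # slang normalization


def binary_metrics(y_true, y_pred, positive="positive"):
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    print(label_from_rating(5), preprocess("Aplikasi ini bagus bgt!!! 5/5"))
```

In a full pipeline, the token lists would then be joined and vectorized with TF-IDF before being passed to the classifiers; that step is left out here to keep the sketch free of third-party dependencies.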

Co-Authors: Suprayogi Suprayogi
