JOURNAL OF APPLIED INFORMATICS AND COMPUTING

Vol. 10 No. 1 (2026): February 2026

Sinaga, Asra Gretya
Robet, Robet
Pribadi, Octara

Publish Date
04 Feb 2026

Sentiment analysis supports data-driven decisions by turning product reviews into reliable polarity labels. We compare four text representations (TF-IDF, TF-IDF reduced via SVD, Word2Vec trained from scratch, and a hybrid that combines TF-IDF (SVD-300) with Word2Vec) for sentiment classification of Indonesian Shopee product reviews from Kaggle (~2.5k texts). After normalization (with optional emoji handling and Indonesian stemming), ratings are mapped to binary sentiment (≤2 negative, ≥4 positive; 3 discarded). Each representation is evaluated with Logistic Regression, Support Vector Machines (linear/RBF), Naive Bayes, and Random Forest under stratified 5-fold cross-validation. TF-IDF with Logistic Regression (C=1.0) yields the best results (F1-macro = 0.816 ± 0.026; Accuracy = 0.816 ± 0.026), with LinearSVC as a strong runner-up. Word2Vec trained from scratch performs worse, consistent with the limited data being insufficient to learn stable embeddings, while the hybrid representation offers only modest gains over Word2Vec and does not surpass TF-IDF. These findings indicate that TF-IDF is the most reliable and consistent representation for small, short-text review datasets, and they underscore the impact of feature design on downstream classification performance.
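
The labeling rule and the best-performing configuration described in the abstract can be sketched roughly as follows. This is an illustrative sketch, not the authors' code: it assumes scikit-learn, and the function names (rating_to_label, build_dataset, evaluate_tfidf_logreg), the default TfidfVectorizer settings, max_iter=1000, and the shuffled folds with a fixed seed are all assumptions that the abstract does not specify. Only the rating thresholds, C=1.0, and the stratified 5-fold setup come from the text.

```python
from typing import Optional


def rating_to_label(rating: int) -> Optional[int]:
    """Map a star rating to binary sentiment as described in the abstract.

    <=2 -> 0 (negative), >=4 -> 1 (positive), 3 -> None (discarded).
    """
    if rating <= 2:
        return 0
    if rating >= 4:
        return 1
    return None


def build_dataset(reviews, ratings):
    """Drop neutral (3-star) reviews and return parallel text/label lists."""
    texts, labels = [], []
    for text, rating in zip(reviews, ratings):
        label = rating_to_label(rating)
        if label is not None:
            texts.append(text)
            labels.append(label)
    return texts, labels


def evaluate_tfidf_logreg(texts, labels, seed=42):
    """Mean and std of macro-F1 and accuracy over stratified 5-fold CV."""
    # scikit-learn is imported here so the label helpers above work without it.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import StratifiedKFold, cross_validate
    from sklearn.pipeline import make_pipeline

    # Fitting TF-IDF inside the pipeline keeps each fold's vocabulary and IDF
    # weights restricted to that fold's training split.
    # Vectorizer defaults and max_iter are assumptions, not from the abstract.
    model = make_pipeline(
        TfidfVectorizer(),
        LogisticRegression(C=1.0, max_iter=1000),
    )
    # Shuffling with a fixed seed is an assumption made for reproducibility.
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    scores = cross_validate(
        model, texts, labels, cv=cv, scoring=("f1_macro", "accuracy")
    )
    return {
        "f1_macro": (scores["test_f1_macro"].mean(), scores["test_f1_macro"].std()),
        "accuracy": (scores["test_accuracy"].mean(), scores["test_accuracy"].std()),
    }
```

Calling build_dataset on the raw reviews and ratings, then passing the result to evaluate_tfidf_logreg, would return a mean and standard deviation per metric, which is the form in which the abstract reports its scores. Each class needs at least five examples for the 5-fold stratified split to work.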

Journal Info

Abbrev

JAIC

Publisher

Politeknik Negeri Batam

Subject

Computer Science & IT

Description

Journal of Applied Informatics and Computing (JAIC) Volume 2, Number 1, July 2018. Contains articles drawn from research in the field of Informatics Technology and Applied Computing, with e-ISSN: 2548-9828. There are 3 articles that have been substantively reviewed by the editorial team and ...

Comprehensive Comparison of TF-IDF and Word2Vec in Product Sentiment Classification Using Machine Learning Models