Tokopedia, one of Indonesia's largest e-commerce platforms, offers a wide range of products accompanied by diverse customer reviews. These reviews reflect consumer opinions and provide valuable insights for service improvement and marketing strategies. Sentiment analysis is crucial for understanding customer perceptions, but processing large-scale, high-dimensional text data remains a challenge that affects both model efficiency and accuracy. This research uses Principal Component Analysis (PCA) to reduce data dimensionality while retaining the information most relevant to sentiment classification. The study begins by collecting Tokopedia product reviews and preprocessing the text through data cleaning, tokenization, stopword removal, and stemming. The reviews are then converted into numerical vectors using the Term Frequency-Inverse Document Frequency (TF-IDF) method. A Gaussian Naïve Bayes model is employed to classify sentiment into three categories: positive, neutral, and negative. The results show that applying PCA raises model accuracy from 63.13% to 70.47%, with gains in precision, recall, and F1-score, which reach 71.85%, 70.47%, and 71.06%, respectively. This research contributes to sentiment analysis techniques by applying PCA to Tokopedia reviews and offers an approach that can be applied to other e-commerce platforms.
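The pipeline described above (TF-IDF vectorization, PCA reduction, Gaussian Naïve Bayes classification) can be illustrated with a minimal sketch. It assumes scikit-learn, which the abstract does not name. The toy reviews, the labels, the choice of three components, and the helper names `build_features` and `predict_sentiment` are all hypothetical and are not taken from the study. Cleaning, stopword removal, and stemming are assumed to have been done already.

```python
from sklearn.decomposition import PCA
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import GaussianNB

# Hypothetical, already-preprocessed toy reviews (NOT data from the study).
reviews = [
    "barang bagus kirim cepat",
    "kualitas bagus puas sekali",
    "barang sesuai deskripsi biasa saja",
    "lumayan standar biasa",
    "barang rusak kecewa sekali",
    "kirim lambat kecewa",
]
labels = ["positive", "positive", "neutral", "neutral", "negative", "negative"]


def build_features(texts, n_components):
    """Fit TF-IDF, then PCA, and return both fitted models plus the reduced matrix."""
    tfidf = TfidfVectorizer()
    # PCA and GaussianNB expect dense input, so the sparse TF-IDF matrix is densified.
    dense = tfidf.fit_transform(texts).toarray()
    pca = PCA(n_components=n_components, random_state=0)
    reduced = pca.fit_transform(dense)
    return tfidf, pca, reduced


# n_components=3 is an arbitrary choice for this toy example.
tfidf, pca, X = build_features(reviews, n_components=3)
clf = GaussianNB().fit(X, labels)


def predict_sentiment(texts):
    """Project new reviews through the fitted TF-IDF and PCA, then classify."""
    dense = tfidf.transform(texts).toarray()
    return clf.predict(pca.transform(dense))


print(X.shape)  # reduced feature matrix: one row per review, three columns
print(predict_sentiment(["barang bagus puas"]))
```

In a real experiment the TF-IDF vocabulary and the PCA projection would be fitted on the training split only and then applied to the test split, and the number of components would be tuned rather than fixed in advance.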
Copyrights © 2025