This study investigates sentiment analysis of user reviews of Bukalapak, a major Indonesian e-commerce platform, using the Multinomial Naïve Bayes (MNB) classifier. It addresses two challenges: class imbalance in the data and the linguistic complexity of Indonesian, including the slang, affixes, and negation that are common in user reviews. The data were collected by scraping reviews of the Bukalapak app from the Google Play Store, yielding a dataset of 19,999 reviews. A structured preprocessing pipeline was applied to prepare the data: text normalization, tokenization, stopword removal, stemming, and term frequency-inverse document frequency (TF-IDF) weighting. The results show that the model performs well on neutral reviews (81% accuracy) but struggles with positive and negative reviews, where class imbalance leads to lower accuracy. The study also applies SMOTE (Synthetic Minority Over-sampling Technique) to handle the imbalance and k-fold cross-validation for model evaluation, which improved the model’s reliability. The findings indicate that Multinomial Naïve Bayes is effective for large-scale sentiment analysis in the e-commerce domain, particularly on platforms with large volumes of user-generated content. The research concludes that sentiment analysis can benefit e-commerce platforms by improving customer service, informing product management decisions, and providing insights for business strategy.
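The pipeline summarized above (TF-IDF weighting, SMOTE-style oversampling, Multinomial Naïve Bayes, and k-fold evaluation) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the study's implementation: the function names are hypothetical, the twelve tokenized reviews are invented toy data, folds are assigned round-robin rather than by a stratified splitter, and the preprocessing steps (normalization, stopword removal, stemming) are assumed to have been applied already. Oversampling is done inside each training fold only, so that synthetic samples never leak into the held-out fold.

```python
import math
import random
from collections import Counter, defaultdict

def fit_idf(docs):
    """Smoothed inverse document frequency learned from tokenized training docs."""
    n = len(docs)
    df = Counter(t for doc in docs for t in set(doc))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

def tfidf(doc, idf):
    """Sparse TF-IDF vector as a dict; terms unseen in training are dropped."""
    return {t: tf * idf[t] for t, tf in Counter(doc).items() if t in idf}

def smote(vectors, n_new, rng, k=2):
    """SMOTE-style sampling: interpolate between a sample and one of its k nearest neighbours."""
    def dist(a, b):
        return sum((a.get(t, 0.0) - b.get(t, 0.0)) ** 2 for t in set(a) | set(b))
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(vectors)
        others = sorted((v for v in vectors if v is not x), key=lambda v: dist(x, v))
        nb = rng.choice(others[:k])
        gap = rng.random()  # position on the segment between x and nb
        synthetic.append({t: x.get(t, 0.0) + gap * (nb.get(t, 0.0) - x.get(t, 0.0))
                          for t in set(x) | set(nb)})
    return synthetic

def balance(X, y, rng):
    """Oversample every class (with at least two samples) up to the largest class size."""
    by_class = defaultdict(list)
    for x, label in zip(X, y):
        by_class[label].append(x)
    target = max(len(g) for g in by_class.values())
    X_bal, y_bal = list(X), list(y)
    for label, group in by_class.items():
        if 2 <= len(group) < target:
            new = smote(group, target - len(group), rng)
            X_bal += new
            y_bal += [label] * len(new)
    return X_bal, y_bal

def train_mnb(X, y, vocab, alpha=1.0):
    """Multinomial Naive Bayes on TF-IDF weights with Laplace smoothing (alpha)."""
    classes = sorted(set(y))
    prior = {c: math.log(y.count(c) / len(y)) for c in classes}
    weight = {c: defaultdict(float) for c in classes}
    for x, label in zip(X, y):
        for t, w in x.items():
            weight[label][t] += w
    loglik = {}
    for c in classes:
        denom = sum(weight[c].values()) + alpha * len(vocab)
        loglik[c] = {t: math.log((weight[c][t] + alpha) / denom) for t in vocab}
    return prior, loglik

def predict(model, x):
    """Return the class with the highest posterior log-score."""
    prior, loglik = model
    return max(prior, key=lambda c: prior[c] + sum(w * loglik[c][t] for t, w in x.items()))

def cross_validate(docs, labels, k=3, seed=0):
    """Round-robin k-fold; returns per-class recall on the held-out folds."""
    rng = random.Random(seed)
    hits, totals = Counter(), Counter()
    for fold in range(k):
        train = [i for i in range(len(docs)) if i % k != fold]
        test = [i for i in range(len(docs)) if i % k == fold]
        idf = fit_idf([docs[i] for i in train])
        X, y = balance([tfidf(docs[i], idf) for i in train],
                       [labels[i] for i in train], rng)
        model = train_mnb(X, y, idf)
        for i in test:
            totals[labels[i]] += 1
            hits[labels[i]] += predict(model, tfidf(docs[i], idf)) == labels[i]
    return {c: hits[c] / totals[c] for c in totals}

# Invented toy reviews (already tokenized), ordered so every fold sees every class.
docs = [r.split() for r in [
    "aplikasi bagus cepat", "bagus mantap", "mantap cepat",
    "aplikasi lambat error", "error terus lambat", "lambat kecewa",
    "barang sampai", "pengiriman biasa", "harga standar",
    "barang biasa", "pesanan sampai", "harga biasa",
]]
labels = ["positive"] * 3 + ["negative"] * 3 + ["neutral"] * 6
scores = cross_validate(docs, labels)
print(scores)
```

Reporting a score per class, rather than a single overall accuracy, is what makes the effect of imbalance visible: a model can score well overall while still misclassifying most minority-class reviews. In practice, a library such as scikit-learn together with imbalanced-learn would likely replace these hand-written helpers.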
Copyright © 2025