Journal : JOURNAL OF APPLIED INFORMATICS AND COMPUTING

Optimizing Feature Extraction for Naïve Bayes Sentiment Analysis
Achmad, Achmad; Budiman, Fikri
Journal of Applied Informatics and Computing Vol. 10 No. 1 (2026): February 2026
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v10i1.12041

Abstract

The rapid growth of e-commerce platforms such as Tokopedia has generated a large volume of user reviews expressing diverse opinions about products and services. These reviews reflect consumer perceptions and provide valuable insights for business decision-making. This study aims to improve sentiment analysis performance with the Naïve Bayes algorithm by comparing two feature extraction techniques: Bag of Words (BoW) and Term Frequency–Inverse Document Frequency (TF-IDF). The dataset consists of 5,400 Tokopedia product reviews obtained from the Kaggle platform, categorized into positive and negative sentiments. The research process comprises four stages: text preprocessing (text cleaning, case folding, tokenization, stopword removal, and stemming); feature extraction using BoW and TF-IDF; handling of class imbalance using the Synthetic Minority Over-sampling Technique (SMOTE); and model training using Naïve Bayes. The dataset is divided into 80% training data and 20% testing data, and model performance is evaluated using accuracy, precision, recall, and F1-score. The results show that BoW achieved the higher accuracy at 93%, while TF-IDF reached 83%, indicating that BoW provides a more effective feature representation and more stable performance for Naïve Bayes-based sentiment analysis on this dataset.
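
The comparison described in this abstract can be illustrated with a minimal standard-library sketch. It is not the authors' implementation: the four toy reviews are invented, a multinomial Naïve Bayes variant with Laplace smoothing and a smoothed IDF formula are assumed (the abstract specifies neither), and preprocessing is reduced to lowercasing and whitespace tokenization. SMOTE and the 80/20 split are omitted.

```python
import math
from collections import Counter

# Invented toy reviews, NOT taken from the Tokopedia dataset; 1 = positive, 0 = negative.
TRAIN = [
    ("barang bagus pengiriman cepat", 1),
    ("kualitas bagus sesuai deskripsi", 1),
    ("barang rusak pengiriman lambat", 0),
    ("kualitas buruk tidak sesuai", 0),
]

def tokenize(text):
    # Simplified stand-in for cleaning, case folding, and tokenization.
    return text.lower().split()

def bow_vector(tokens):
    # Bag of Words: raw term counts.
    return dict(Counter(tokens))

def idf_table(docs):
    # Assumed smoothed IDF: log((1 + n) / (1 + df)) + 1.
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}

def tfidf_vector(tokens, idf):
    # TF-IDF: term count scaled by IDF; unseen terms get weight 0.
    return {term: count * idf.get(term, 0.0) for term, count in Counter(tokens).items()}

def train_nb(vectors, labels, alpha=1.0):
    # Multinomial Naive Bayes with Laplace smoothing (alpha is an assumed default).
    vocab = {term for vec in vectors for term in vec}
    model = {"vocab": vocab, "prior": {}, "loglik": {}}
    for cls in sorted(set(labels)):
        rows = [vec for vec, y in zip(vectors, labels) if y == cls]
        model["prior"][cls] = math.log(len(rows) / len(vectors))
        totals = Counter()
        for vec in rows:
            for term, weight in vec.items():
                totals[term] += weight
        denom = sum(totals.values()) + alpha * len(vocab)
        model["loglik"][cls] = {
            term: math.log((totals[term] + alpha) / denom) for term in vocab
        }
    return model

def predict_nb(model, vector):
    # Pick the class with the highest log-posterior; out-of-vocabulary terms are skipped.
    scores = {}
    for cls, prior in model["prior"].items():
        scores[cls] = prior + sum(
            weight * model["loglik"][cls][term]
            for term, weight in vector.items()
            if term in model["vocab"]
        )
    return max(scores, key=scores.get)

def fit_predict(kind, train, test_texts):
    # kind is "bow" or "tfidf"; both feed the same classifier.
    docs = [tokenize(text) for text, _ in train]
    labels = [label for _, label in train]
    if kind == "bow":
        featurize = bow_vector
    else:
        idf = idf_table(docs)
        featurize = lambda tokens: tfidf_vector(tokens, idf)
    model = train_nb([featurize(doc) for doc in docs], labels)
    return [predict_nb(model, featurize(tokenize(text))) for text in test_texts]

if __name__ == "__main__":
    samples = ["barang bagus", "barang rusak"]
    for kind in ("bow", "tfidf"):
        print(kind, fit_predict(kind, TRAIN, samples))
```

Because only the `featurize` step differs between the two runs, any difference in predictions comes from the feature representation alone, which is the comparison the study reports.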

Comparison of Random Forest and LSTM for Tokopedia Sentiment Analysis
Saputra, Fahrizal Denta; Budiman, Fikri
Journal of Applied Informatics and Computing Vol. 10 No. 1 (2026): February 2026
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v10i1.12042

Abstract

Tokopedia is one of the largest e-commerce platforms in Indonesia, where every transaction generates user reviews containing opinions about the products or services received. These reviews provide important information about product quality, but their sheer volume makes manual analysis inefficient. This study aims to automatically classify the sentiment of Tokopedia reviews and to compare the performance of machine learning and deep learning methods. The dataset was obtained from Kaggle and underwent an initial cleaning stage that included removing irrelevant columns and manually labeling reviews into two sentiment classes, positive and negative. The research methodology comprises several stages: data preprocessing (cleaning, case folding, stopword removal, tokenization, normalization, and stemming), feature extraction using TF-IDF for Random Forest and word embeddings for LSTM, implementation of the Random Forest and Long Short-Term Memory (LSTM) models, and model evaluation using a confusion matrix. Experimental results show that LSTM performs best with 94% accuracy, while Random Forest achieves 92% accuracy. These findings indicate that LSTM is more effective at capturing language context, yielding more accurate sentiment classification that can support decision-making in the e-commerce field.
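
As an illustration of confusion-matrix evaluation for a two-class sentiment task, the sketch below derives accuracy, precision, recall, and F1-score from the four matrix cells. The label lists are invented for demonstration and are unrelated to the reported 94% and 92% results; the function names are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    # Count the four cells of a binary confusion matrix.
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def classification_metrics(tp, fp, fn, tn):
    # Standard definitions; each ratio falls back to 0.0 when its denominator is zero.
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

if __name__ == "__main__":
    # Invented labels: 1 = positive, 0 = negative.
    y_true = [1, 1, 1, 0, 0, 0, 1, 0]
    y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
    print(classification_metrics(*confusion_counts(y_true, y_pred)))
```

Accuracy alone can hide poor performance on a minority class, which is why precision and recall are usually read alongside it.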