JOURNAL OF APPLIED INFORMATICS AND COMPUTING
Vol. 10 No. 1 (2026): February 2026

Optimizing Feature Extraction for Naïve Bayes Sentiment Analysis

Achmad, Achmad (Unknown)
Budiman, Fikri (Unknown)



Article Info

Publish Date
04 Feb 2026

Abstract

The rapid growth of e-commerce platforms such as Tokopedia has generated a large volume of user reviews expressing diverse opinions about products and services. These reviews reflect consumer perceptions and provide valuable input for business decision-making. This study aims to improve sentiment analysis performance with the Naïve Bayes algorithm by comparing two feature extraction techniques: Bag of Words (BoW) and Term Frequency–Inverse Document Frequency (TF-IDF). The dataset consists of 5,400 Tokopedia product reviews obtained from the Kaggle platform, each labeled as positive or negative. The research process comprises four stages: text preprocessing (text cleaning, case folding, tokenization, stopword removal, and stemming); feature extraction using BoW and TF-IDF; handling class imbalance with the Synthetic Minority Over-sampling Technique (SMOTE); and model training with a Naïve Bayes classifier. The dataset is split into 80% training data and 20% testing data, and model performance is evaluated using accuracy, precision, recall, and F1-score. The results show that BoW achieved the higher accuracy at 93%, while TF-IDF reached 83%, indicating that BoW provides a more effective feature representation and more stable performance for Naïve Bayes-based sentiment analysis on this dataset.
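
To make the comparison in the abstract concrete, the following is a minimal pure-Python sketch of the two feature representations feeding a multinomial Naïve Bayes classifier, followed by precision, recall, and F1-score. It is not the authors' implementation. The toy reviews, the function names, the smoothed IDF variant, and the Laplace smoothing value are all assumptions made for illustration. Stopword removal, stemming, SMOTE, and the 80/20 split are omitted, so the sketch does not reproduce the reported accuracies.

```python
import math
from collections import Counter

# Made-up toy reviews (1 = positive, 0 = negative); NOT the Tokopedia dataset.
TRAIN = [
    ("good product fast delivery", 1),
    ("great quality good price", 1),
    ("bad product slow delivery", 0),
    ("poor quality bad seller", 0),
]
TEST = [("good quality", 1), ("bad slow delivery", 0)]


def tokenize(text):
    # Stand-in for the preprocessing stage: case folding and whitespace split only.
    return text.lower().split()


def bow(tokens):
    # Bag of Words: raw term counts.
    return dict(Counter(tokens))


def idf_table(docs):
    # Smoothed IDF, ln((1 + N) / (1 + df)) + 1; one common variant, assumed here.
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {t: math.log((1 + n) / (1 + d)) + 1 for t, d in df.items()}


def tfidf(tokens, idf):
    # TF-IDF: term count scaled by IDF; terms unseen in training are dropped.
    return {t: c * idf[t] for t, c in Counter(tokens).items() if t in idf}


def train_nb(vectors, labels, alpha=1.0):
    # Multinomial Naive Bayes with add-alpha (Laplace) smoothing.
    vocab = {t for v in vectors for t in v}
    model = {"vocab": vocab, "prior": {}, "loglik": {}}
    for c in sorted(set(labels)):
        rows = [v for v, y in zip(vectors, labels) if y == c]
        model["prior"][c] = math.log(len(rows) / len(vectors))
        totals = {}
        for v in rows:
            for t, w in v.items():
                totals[t] = totals.get(t, 0.0) + w
        denom = sum(totals.values()) + alpha * len(vocab)
        model["loglik"][c] = {
            t: math.log((totals.get(t, 0.0) + alpha) / denom) for t in vocab
        }
    return model


def predict(model, vector):
    # Pick the class with the highest log prior plus weighted log-likelihood.
    scores = {}
    for c, prior in model["prior"].items():
        scores[c] = prior + sum(
            w * model["loglik"][c][t]
            for t, w in vector.items()
            if t in model["vocab"]
        )
    return max(scores, key=scores.get)


def precision_recall_f1(y_true, y_pred, positive=1):
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def run(feature):
    # feature is "bow" or "tfidf"; returns predictions for TEST.
    train_tokens = [tokenize(text) for text, _ in TRAIN]
    labels = [y for _, y in TRAIN]
    idf = idf_table(train_tokens)
    vec = bow if feature == "bow" else (lambda toks: tfidf(toks, idf))
    model = train_nb([vec(toks) for toks in train_tokens], labels)
    return [predict(model, vec(tokenize(text))) for text, _ in TEST]


if __name__ == "__main__":
    y_true = [y for _, y in TEST]
    for feature in ("bow", "tfidf"):
        y_pred = run(feature)
        print(feature, y_pred, precision_recall_f1(y_true, y_pred))
```

On a corpus this small both representations classify the two test reviews identically. Differences such as the one reported in the abstract only emerge on a real dataset, where IDF weighting changes the relative contribution of frequent and rare terms to the class likelihoods.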

Copyrights © 2026






Journal Info

Abbrev

JAIC

Publisher

Subject

Computer Science & IT

Description

Journal of Applied Informatics and Computing (JAIC) Volume 2, Number 1, July 2018. Contains articles drawn from research in the field of Informatics Technology and Applied Computing, with e-ISSN: 2548-9828. There are 3 articles that have been substantively reviewed by the editorial team and ...