Kaisalana, Mustafid
Unknown Affiliation

Published: 1 document


Improving Naïve Bayes Model Performance for Sentiment Analysis of Comments on “Sound Horeg” Using SMOTE and Parameter Tuning
(Original title: Peningkatan Kinerja Model Naïve Bayes untuk Analisis Sentimen Komentar Terkait “Sound Horeg” Menggunakan SMOTE dan Tuning Parameter)
Authors: Kaisalana, Mustafid; Trisnapradika, Gustina Alfa
Building of Informatics, Technology and Science (BITS) Vol 7 No 3 (2025): December 2025
Publisher: Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v7i3.8554

Abstract

The phenomenon of “Sound Horeg” on online platforms has sparked diverse public sentiments, making sentiment analysis an essential tool for understanding public opinion. This study aims to classify user sentiments (positive or negative) related to “Sound Horeg” using the Naïve Bayes algorithm. The dataset exhibits significant class imbalance, with negative sentiments predominating. The methodology involves a series of text preprocessing stages: case folding, tokenization, normalization, lexicon-based sentiment labeling, stopword removal, stemming, and duplicate removal. Sentiment labeling uses an Indonesian sentiment lexicon compiled from two source files, lexicon_positif.csv and lexicon_negatif.csv, which contain predefined lists of words with positive and negative sentiment scores drawn from Indonesian public opinion lexicons. Text features are then extracted using the Term Frequency–Inverse Document Frequency (TF-IDF) method. To address the imbalance, the Synthetic Minority Oversampling Technique (SMOTE) is applied to the training data to balance the number of positive and negative samples. The Naïve Bayes model is then tuned with GridSearchCV to find the best value of the smoothing parameter alpha. Experimental results show that the unoptimized Naïve Bayes model achieved an accuracy of 73% but struggled to classify the minority class (positive sentiment) because of the class imbalance. After applying SMOTE and parameter tuning, the model’s performance improved significantly, demonstrating the effectiveness of these techniques in producing a more balanced and robust model. The study concludes that the Naïve Bayes algorithm, when combined with SMOTE and hyperparameter tuning, is effective for Indonesian-language sentiment analysis, particularly on imbalanced datasets. Future work may include exploring other algorithms and employing broader sentiment lexicons and more complex linguistic features to further enhance model performance.
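
The oversampling step named in the abstract can be illustrated with a small sketch. The abstract does not say which SMOTE implementation the authors used (the imbalanced-learn library provides one), so the code below is not their method. It is a minimal standard-library version of the core idea: each synthetic minority sample is a random point on the line segment between a minority sample and one of its nearest minority neighbours. The function name `smote`, its parameters `n_new`, `k`, and `seed`, and the toy vectors are all hypothetical; in the study the inputs would be TF-IDF vectors from the training split only.

```python
import random


def smote(minority, n_new, k=2, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Hypothetical helper for illustration only; `minority` is a list of
    equal-length numeric vectors belonging to the under-represented class.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])  # one of the k nearest neighbours
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Toy training split: 6 "negative" vectors versus 3 "positive" vectors.
negative = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.1], [0.9, 0.3], [0.6, 0.2], [0.8, 0.0]]
positive = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]

# Generate just enough synthetic positives to match the majority class size.
positive_balanced = positive + smote(positive, n_new=len(negative) - len(positive))
print(len(negative), len(positive_balanced))
```

Oversampling only the training split, as the abstract describes, keeps synthetic points out of the evaluation data, so the test scores still reflect the original class distribution.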