Journal : Journal of Applied Data Sciences

Optimizing Sentiment Analysis on Imbalanced Hotel Review Data Using SMOTE and Ensemble Machine Learning Techniques
Putra, Pandu Pratama; Anam, M. Khairul; Chan, Andi Supriadi; Hadi, Abrar; Hendri, Nofri; Masnur, Alkadri
Journal of Applied Data Sciences Vol 6, No 2: MAY 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i2.618

Abstract

This research addresses the challenge of imbalanced sentiment classes in hotel review datasets obtained from Traveloka by integrating SMOTE (Synthetic Minority Oversampling Technique) with ensemble machine learning methods. The study aimed to enhance the classification of Positive, Negative, and Neutral sentiments in customer reviews. Data preprocessing techniques, including tokenization, stemming, and stopword removal, prepared the textual data for analysis. Various machine learning models—CART, KNN, Naive Bayes, and Random Forest—were evaluated individually and in ensemble configurations such as Bagging, Stacking, Soft Voting, and Hard Voting. The Stacking ensemble approach, utilizing Logistic Regression as a meta-classifier, demonstrated superior performance with an accuracy, precision, recall, and F1-score of 88%, outperforming Bagging (86%), Hard Voting (84%), and Soft Voting (81%). The findings highlight the effectiveness of SMOTE in balancing sentiment classes, particularly improving the classification of underrepresented Neutral and Negative categories. The novelty of this study lies in the comprehensive use of ensemble techniques combined with SMOTE, which significantly enhanced prediction stability and accuracy compared to previous approaches. These results provide valuable insights into leveraging advanced machine learning techniques for sentiment analysis, offering practical implications for improving customer experience and service quality in the hospitality industry.
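
The oversampling step named in this abstract can be illustrated with a minimal sketch of the SMOTE idea: each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its nearest minority-class neighbours. This is not the authors' implementation. The function name `smote`, its parameters, and the toy two-dimensional points are assumptions made for illustration, and real review text would first have to be turned into numeric feature vectors.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Illustrative sketch only: needs at least two minority samples, and
    uses a brute-force neighbour search that suits small toy data.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k closest minority samples to the base point (the base itself excluded).
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical feature vectors for an under-represented class.
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote(minority, n_new=6, k=2)
```

Because every synthetic point is a convex combination of two existing minority samples, it always lies within the range already covered by that class. Oversampling is normally applied to the training split only, so that synthetic points do not leak into evaluation.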
Enhancing the Performance of Machine Learning Algorithm for Intent Sentiment Analysis on Village Fund Topic
Anam, M. Khairul; Putra, Pandu Pratama; Malik, Rio Andika; Karfindo, Karfindo; Putra, Teri Ade; Elva, Yesri; Mahessya, Raja Ayu; Firdaus, Muhammad Bambang; Ikhsan, Ikhsan; Gunawan, Chichi Rizka
Journal of Applied Data Sciences Vol 6, No 2: MAY 2025
Publisher : Bright Publisher

DOI: 10.47738/jads.v6i2.637

Abstract

This study explores the implementation of Intent Sentiment Analysis on Twitter data related to the Village Fund program, leveraging Multinomial Naïve Bayes (MNB) and enhancing it with Synthetic Minority Over-sampling Technique (SMOTE) and XGBoost (XGB). The analysis categorizes tweets into six labels: Optimistic, Pessimistic, Advice, Satire, Appreciation, and No Intent. Initially, the MNB model achieved an accuracy of 67% on a 90:10 data split. By applying SMOTE, accuracy improved by 12%, reaching 89%. However, adding Chi-Square feature selection did not increase accuracy further. Incorporating XGB into the MNB+SMOTE model led to a 6% improvement, achieving a final accuracy of 95%. Comprehensive model evaluation revealed that the MNB+SMOTE+XGB model achieved 96% accuracy, 96% precision, 96% recall, and a 96% F1-score, with an AUC of 99%, categorizing it as excellent. These findings demonstrate that the combination of SMOTE for addressing class imbalance and XGBoost for boosting performance significantly enhances the MNB model's classification capabilities. The novelty lies in the integration of these techniques to improve intent sentiment classification for public opinion analysis on the Village Fund program. The results indicate that the majority of tweets labeled as "No Intent" reflect a lack of specific sentiment or actionable intent, providing valuable insights into public perception of the program.
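
For readers unfamiliar with the base model named in this abstract, the following is a minimal sketch of a Multinomial Naive Bayes text classifier with Laplace (add-one) smoothing. It is not the authors' code. The function names `train_mnb` and `predict_mnb`, the smoothing parameter `alpha`, and the toy tokenized tweets are hypothetical, and only two of the six labels are used to keep the example short.

```python
import math
from collections import Counter, defaultdict


def train_mnb(docs, labels, alpha=1.0):
    """Fit per-class log priors and smoothed per-word log likelihoods."""
    vocab = {word for doc in docs for word in doc}
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        word_counts[label].update(doc)

    model = {}
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        denominator = total_words + alpha * len(vocab)
        log_prior = math.log(n_docs / len(labels))
        log_likelihood = {
            word: math.log((word_counts[label][word] + alpha) / denominator)
            for word in vocab
        }
        model[label] = (log_prior, log_likelihood)
    return model


def predict_mnb(model, doc):
    """Return the label with the highest log posterior; unseen words are skipped."""
    def score(label):
        log_prior, log_likelihood = model[label]
        return log_prior + sum(log_likelihood[w] for w in doc if w in log_likelihood)
    return max(model, key=score)


# Hypothetical, already tokenized training tweets.
docs = [
    ["fund", "helps", "village"],
    ["fund", "builds", "roads"],
    ["fund", "wasted", "again"],
    ["corruption", "wasted", "fund"],
]
labels = ["Optimistic", "Optimistic", "Pessimistic", "Pessimistic"]

model = train_mnb(docs, labels)
prediction = predict_mnb(model, ["fund", "wasted"])
```

With this toy data the token "wasted" appears only in the Pessimistic class, so `prediction` is "Pessimistic". When classes are imbalanced, the class priors and word counts are dominated by the majority label, which is the problem oversampling is meant to reduce.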