International Journal of Advances in Artificial Intelligence and Machine Learning

Vol. 2 No. 2 (2025)

Wang, Gunawan (Unknown)
Jaber, Mustafa Musa (Unknown)

Publish Date
16 Jun 2025

Background of study: Online reviews strongly influence consumer behavior, particularly in the hospitality industry. The sentiment of these reviews is difficult to determine because of their subjectivity, disparate writing styles, and a noticeable class imbalance in which positive reviews outnumber negative and neutral ones. Standard machine learning approaches are biased toward the majority class and do not address these problems well.

Aims and scope of paper: This research uses the BERT and LSTM deep learning models to classify hotel customer reviews into three categories: positive, neutral, and negative. It analyzes how well the models predict sentiment and handle data imbalance, and benchmarks them with and without under-sampling.

Methods: A dataset of 20,000 TripAdvisor reviews was preprocessed by removing stop words and special characters, followed by tokenization, stemming, and lemmatization. The star ratings attached to the reviews were grouped into classes: 4-5 stars as positive, 3 stars as neutral, and 1-2 stars as negative. Random under-sampling was applied to the positive class to balance the dataset. The BERT (bert-base-uncased) and LSTM models were trained with an 80:20 train-validation split and evaluated with 5-fold cross-validation using accuracy, precision, recall, and F1 score.

Result: Without under-sampling, BERT achieved the best overall performance, with an accuracy of 0.86, an F1 score of 0.93 for the positive class, and an F1 score of 0.79 for the negative class. However, both models struggled with neutral sentiment (F1 score: BERT 0.43, LSTM 0.25). Under-sampling improved neutral-class recall (BERT: 0.79) but decreased overall accuracy (BERT: 0.73; LSTM: 0.67) and positive-class precision.

Conclusion: BERT generally outperforms LSTM for hotel review sentiment analysis, particularly with imbalanced data. While under-sampling helps address class imbalance by improving neutral recall, it incurs significant performance trade-offs, reducing overall accuracy and majority-class precision because of information loss. Future work should explore advanced resampling (SMOTE, ADASYN) or transfer learning (RoBERTa, XLNet) for better balance and neutral sentiment classification.
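
The labeling, under-sampling, and per-class scoring steps described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the toy ratings, and the choice to cut the positive class down to the size of the largest remaining class are all assumptions made for illustration, since the abstract does not state the exact balancing target.

```python
import random
from collections import Counter


def rating_to_sentiment(stars):
    """Map a 1-5 star rating to the three classes described in the abstract."""
    if stars >= 4:
        return "positive"
    if stars == 3:
        return "neutral"
    return "negative"


def undersample_positive(labels, seed=0):
    """Return sorted indices kept after randomly under-sampling the positive class.

    Assumption (not stated in the abstract): the positive class is reduced
    to the size of the largest remaining class.
    """
    counts = Counter(labels)
    other_sizes = [n for cls, n in counts.items() if cls != "positive"]
    positive_idx = [i for i, lab in enumerate(labels) if lab == "positive"]
    other_idx = [i for i, lab in enumerate(labels) if lab != "positive"]
    if not other_sizes or len(positive_idx) <= max(other_sizes):
        return list(range(len(labels)))  # nothing to under-sample
    rng = random.Random(seed)
    kept_positive = rng.sample(positive_idx, max(other_sizes))
    return sorted(kept_positive + other_idx)


def f1_for_class(y_true, y_pred, cls):
    """Compute the F1 score for one class from true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == cls and p == cls)
    fp = sum(1 for t, p in pairs if t != cls and p == cls)
    fn = sum(1 for t, p in pairs if t == cls and p != cls)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy ratings, skewed toward positive like the real dataset.
ratings = [5, 5, 4, 5, 4, 5, 3, 1, 2, 3]
labels = [rating_to_sentiment(r) for r in ratings]
kept = undersample_positive(labels, seed=42)
balanced = Counter(labels[i] for i in kept)
```

In this toy run, four of the six positive reviews are discarded to balance the classes. This is the information loss the conclusion refers to when it explains the drop in overall accuracy.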

Journal Info

International Journal of Advances in Artificial Intelligence and Machine Learning

Abbrev

ijaaiml

Publisher

CV Media Inti Teknologi

Subject

Computer Science & IT

Description

The International Journal of Advances in Artificial Intelligence and Machine Learning (IJAAIML) is a prominent academic journal dedicated to publishing cutting-edge research and developments in the fields of Artificial Intelligence (AI) and Machine Learning (ML). It serves as an essential platform ...

Article Info

Abstract

A Deep Learning Approach to Sentiment Analysis of Hotel Reviews: Comparing BERT and LSTM Models
