This study evaluates a sentiment classification model based on IndoBERT for Indonesian hotel reviews. Sentiment analysis is crucial for extracting actionable insights from customer reviews, yet linguistic diversity and imbalanced datasets complicate accurate classification. The dataset comprises 90% Positive, 5% Neutral, and 5% Negative reviews, a significant class imbalance. IndoBERT was fine-tuned for three epochs, and performance was assessed using accuracy, precision, recall, F1-score, confusion matrices, and ROC and Precision-Recall curves. The results show high overall accuracy (92.52%) and robust performance for the Positive class (F1-score: 96.09%, AUC: 0.90). However, the model performed poorly on the minority classes: the Neutral class scored 0.00 on precision, recall, and F1-score, and the Negative class reached an F1-score of only 28.57%. These findings underscore the influence of dataset imbalance, in which the dominant Positive class skews model predictions; because 90% of the reviews are Positive, high overall accuracy can coexist with near-total failure on the other classes. Future research should explore techniques such as oversampling (e.g., SMOTE), class-weighted loss functions, or hybrid architectures to mitigate the imbalance and improve performance across all sentiment categories. This research contributes to sentiment classification methodology for Indonesian text and has practical implications for customer feedback analysis in the hospitality industry.
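As an illustration of the loss reweighting suggested for future work, the following is a minimal sketch, not the authors' implementation. It assumes inverse-frequency class weights (N / (K × count)) and a hypothetical 100-review sample with the reported 90/5/5 split; the function names `inverse_frequency_weights` and `weighted_cross_entropy` are illustrative and do not come from the study.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by N / (K * count), so rarer classes weigh more."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * c) for cls, c in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean negative log-likelihood.

    probs is a list of dicts mapping class name -> predicted probability.
    """
    total = sum(weights[y] * -math.log(p[y]) for p, y in zip(probs, labels))
    norm = sum(weights[y] for y in labels)
    return total / norm


# Hypothetical sample mirroring the reported class proportions (assumption).
labels = ["Positive"] * 90 + ["Neutral"] * 5 + ["Negative"] * 5
weights = inverse_frequency_weights(labels)

# A classifier that simply echoes the class prior for every review.
prior = {"Positive": 0.90, "Neutral": 0.05, "Negative": 0.05}
probs = [prior] * len(labels)

uniform = {cls: 1.0 for cls in weights}
plain_loss = weighted_cross_entropy(probs, labels, uniform)
balanced_loss = weighted_cross_entropy(probs, labels, weights)

print(weights)
print(f"unweighted loss: {plain_loss:.3f}, class-weighted loss: {balanced_loss:.3f}")
```

Under these assumptions, the weighted loss penalizes the prior-echoing classifier far more heavily than the unweighted loss does, because each class contributes equally to the total regardless of how many reviews it contains. That is the intended effect of reweighting: errors on Neutral and Negative reviews can no longer be hidden by the Positive majority.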
Copyrights © 2025