Sentiment analysis is an important method for understanding public opinion from large volumes of text, such as product reviews and user comments. Many studies have shown that the BERT (Bidirectional Encoder Representations from Transformers) model outperforms classical machine learning models such as Support Vector Machine (SVM) and Naïve Bayes. However, few studies have systematically compared the performance of the two approaches on datasets spanning different topics and languages, especially datasets with imbalanced label distributions. This study compares four BERT variants (bert-base-uncased, distilbert-base-uncased, indobert-base-uncased, and distilbert-base-indonesian) with two classical models on three datasets: IMDb 50K (English), Amazon Food Reviews (English), and Gojek App Review (Indonesian). The classical models use TF-IDF vectorisation, while the BERT models are fine-tuned with a layer-freezing technique. Evaluation is carried out using accuracy, precision, recall, and F1-score. The results show that the BERT models excel on the English data, while on the imbalanced Indonesian data, SVM and Naïve Bayes achieve higher F1-scores. These findings indicate that model selection must be matched to the characteristics of the data.
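To make the two experimental pipelines concrete, the sketch below illustrates a TF-IDF plus linear-SVM baseline and a BERT classifier prepared with layer freezing. It is a minimal sketch under stated assumptions, not the study's exact configuration: the vocabulary cap (max_features), the number of unfrozen layers (n_unfrozen), the binary label setup, and macro averaging of the metrics are all illustrative choices not specified in the abstract.

```python
# A minimal sketch of the two pipelines compared in this study, assuming
# binary sentiment labels; hyperparameters are illustrative, not the authors'.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from transformers import AutoModelForSequenceClassification, AutoTokenizer


def classical_baseline(train_texts, train_labels, test_texts, test_labels):
    """TF-IDF vectorisation followed by a linear SVM, as in the classical setup."""
    vectorizer = TfidfVectorizer(max_features=50_000)  # assumed vocabulary cap
    X_train = vectorizer.fit_transform(train_texts)
    X_test = vectorizer.transform(test_texts)
    clf = LinearSVC().fit(X_train, train_labels)
    preds = clf.predict(X_test)
    precision, recall, f1, _ = precision_recall_fscore_support(
        test_labels, preds, average="macro")  # macro averaging is an assumption
    return {"accuracy": accuracy_score(test_labels, preds),
            "precision": precision, "recall": recall, "f1": f1}


def build_frozen_bert(model_name="bert-base-uncased", n_unfrozen=2):
    """Load a BERT classifier and freeze all but the top encoder layers.

    n_unfrozen is an assumed setting; the paper does not state how many
    layers were left trainable during fine-tuning.
    """
    model = AutoModelForSequenceClassification.from_pretrained(
        model_name, num_labels=2)
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    # Freeze the whole encoder first; the classification head stays trainable.
    for param in model.base_model.parameters():
        param.requires_grad = False
    # Unfreeze only the top n_unfrozen transformer layers. This path holds for
    # BERT-style models; DistilBERT exposes its stack under
    # base_model.transformer.layer instead.
    for layer in model.base_model.encoder.layer[-n_unfrozen:]:
        for param in layer.parameters():
            param.requires_grad = True
    return model, tokenizer
```

The returned model can then be trained with a standard fine-tuning loop (e.g. the Hugging Face Trainer); only the unfrozen layers and the classification head receive gradient updates, which is what makes layer freezing cheaper than full fine-tuning.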