Journal : Building of Informatics, Technology and Science

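Perbandingan Kinerja Naive Bayes, Support Vector Machine dan Random Forest Untuk Analisis Sentimen Aplikasi Brimo [Performance Comparison of Naive Bayes, Support Vector Machine, and Random Forest for Sentiment Analysis of the BRImo Application]
Darwin, Amelia; Lestarini, Dinda; Seprina, Iin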
Building of Informatics, Technology and Science (BITS) Vol 7 No 3 (2025): December 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v7i3.8697

Abstract

The growth of financial technology has driven wider use of mobile banking, including BRImo, the mobile banking application of Bank Rakyat Indonesia (BRI). However, user reviews on the Google Play Store report complaints such as login difficulties, system errors, and failed transactions. This study analyzes BRImo user sentiment using three machine learning algorithms: Naive Bayes, Support Vector Machine (SVM), and Random Forest. A total of 4,996 reviews were collected through web scraping and labeled by rating, with ratings of 1-3 treated as negative and 4-5 as positive. Labeling yielded 4,123 positive and 873 negative reviews, which were then balanced using the Synthetic Minority Oversampling Technique (SMOTE). Features were extracted using TF-IDF. In testing, Random Forest performed best, with an accuracy of 0.87; a recall of 0.70 and an F1-score of 0.65 on the negative class; and an F1-score of 0.92 on the positive class. Its macro F1-score reached 0.79, higher than SVM (0.69) and Naive Bayes (0.70). These findings indicate that Random Forest classifies BRImo user sentiment more effectively than the other two models, particularly after data balancing, and can serve as a reference for developers seeking to improve the quality of the application's services.
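The labeling rule and the macro F1-score in this abstract can be checked with a few lines of Python. The sketch below is illustrative only and is not the authors' code: the function names `label_from_rating` and `macro_f1` are hypothetical, and macro F1 is assumed to be the unweighted mean of the per-class F1-scores, which is the usual definition.

```python
from statistics import mean


def label_from_rating(rating: int) -> str:
    """Map a 1-5 star rating to a label: 1-3 negative, 4-5 positive."""
    if not 1 <= rating <= 5:
        raise ValueError(f"rating must be between 1 and 5, got {rating}")
    return "negative" if rating <= 3 else "positive"


def macro_f1(per_class_f1: dict) -> float:
    """Unweighted mean of per-class F1-scores (assumed definition)."""
    return mean(per_class_f1.values())


# Class counts reported in the abstract.
n_positive, n_negative = 4123, 873
total = n_positive + n_negative          # 4,996 reviews in total
negative_share = n_negative / total      # fraction of the minority class

# Per-class F1-scores reported for Random Forest.
rf_macro_f1 = macro_f1({"negative": 0.65, "positive": 0.92})

print(f"total reviews: {total}")
print(f"negative share before balancing: {negative_share:.1%}")
print(f"Random Forest macro F1: {rf_macro_f1:.3f}")
```

The mean of 0.65 and 0.92 is about 0.785, which is consistent with the reported macro F1-score of 0.79. The negative class makes up roughly 17% of the labeled reviews, which is the imbalance that SMOTE is used to correct.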
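
Analisis Sentimen Ulasan Pengguna Aplikasi Sociolla Menggunakan Algoritma Support Vector Machine dengan Optimasi Grid Search [Sentiment Analysis of Sociolla Application User Reviews Using the Support Vector Machine Algorithm with Grid Search Optimization]
Fitriani, Suci; Lestarini, Dinda; Seprina, Iin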
Building of Informatics, Technology and Science (BITS) Vol 7 No 3 (2025): December 2025
Publisher : Forum Kerjasama Pendidikan Tinggi

DOI: 10.47065/bits.v7i3.8686

Abstract

The rapid growth of digital technology has driven innovation in the beauty industry, including the Soco by Sociolla platform, which hosts online product reviews. The growing volume of user reviews creates an opportunity to use sentiment analysis to understand users' perceptions of service quality. The main challenge in modeling sentiment for beauty product reviews is their highly varied, subjective, and informal language, which produces diverse distribution patterns. This study therefore applies the Support Vector Machine (SVM) algorithm to sentiment classification, compares two kernels, Linear and Radial Basis Function (RBF), and evaluates the effect of hyperparameter optimization using Grid Search on beauty e-commerce data. A total of 3,387 reviews were collected and processed in several stages: text preprocessing, labeling, feature extraction using TF-IDF, data splitting, model training, and evaluation. The baseline RBF kernel performed best, with an accuracy of 88.5%, while the baseline Linear kernel achieved 87.76%. Grid Search optimization produced an accuracy of 86.22%, indicating that the explored hyperparameter configurations could not exceed the RBF baseline despite giving stable results during cross-validation. These findings suggest that the linguistic characteristics of beauty reviews are handled more effectively by a non-linear kernel than by the Linear kernel. The results also indicate that hyperparameter optimization does not always increase model accuracy, particularly when the baseline SVM configuration already performs near-optimally for the dataset used.
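To make the kernel comparison and the Grid Search step concrete, here is a minimal Python sketch of the two kernel functions and of how a hyperparameter grid is enumerated. It is not the authors' implementation. The abstract does not report the grid that was searched, so the `C` and `gamma` values below are hypothetical placeholders, and the helper names are chosen only for illustration.

```python
import math
from itertools import product


def linear_kernel(x, y):
    """Linear kernel: the dot product of the two feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma):
    """RBF kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Hypothetical grid: these values are assumptions, not taken from the study.
param_grid = {"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]}

# Grid Search evaluates every combination of the listed values.
candidates = [
    dict(zip(param_grid, values)) for values in product(*param_grid.values())
]

u, v = [1.0, 2.0], [3.0, 4.0]
print("linear kernel:", linear_kernel(u, v))
print("RBF kernel:", rbf_kernel(u, v, gamma=0.1))
print("configurations to evaluate:", len(candidates))
```

Because Grid Search only tries the combinations listed in the grid, its best candidate can score below a default configuration that falls outside the grid, or that happens to suit the test split better. That is one plausible reading of the 86.22% result reported above.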