TIN: TERAPAN INFORMATIKA NUSANTARA
Vol 6 No 12 (2026): May 2026

Neobank Sentiment Classification Analysis: A Comparison of N-Gram Configurations in TF-IDF Using Naive Bayes and SVM

Fatha Amin Mujtahid (Universitas Semarang, Semarang)
Badroe Zaman (Universitas Semarang, Semarang)
Galet Guntoro Aji (Universitas Semarang, Semarang)



Article Info

Publish Date
28 May 2026

Abstract

The growing number of Neobank users in Indonesia has led to a growing volume of user reviews on the Google Play Store, which can be used to assess service satisfaction and user experience. Analyzing these reviews manually is inefficient, which motivates automated machine learning approaches. This study evaluates how the N-Gram configuration used in TF-IDF feature extraction affects sentiment classification of Neobank reviews with Naive Bayes (NB) and Support Vector Machine (SVM). The dataset consists of 3,798 reviews retained after preprocessing 5,000 initial entries collected from Google Play Store Indonesia, with 2,385 positive and 1,413 negative reviews labeled according to their star ratings. The models were evaluated with stratified five-fold cross-validation, which preserves the class proportions in each fold. Features were extracted with TF-IDF using three N-Gram configurations: unigram, bigram, and unigram+bigram. The results indicate that the N-Gram configuration significantly affects the performance of both models. NB reached its highest accuracy with unigrams (87.65%), while SVM performed best with unigram+bigram (88.61% accuracy and 88.22% F1-score). The bigram-only configuration consistently yielded the lowest performance, because short, informal reviews produce sparse bigram features. The study concludes that N-Gram selection should match the characteristics of the algorithm, and that SVM with unigram+bigram is the most effective of the evaluated approaches for sentiment classification of Neobank reviews in Indonesia.
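The three N-Gram configurations compared in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation: the helper names `ngrams` and `tfidf`, the three toy reviews, and the smoothed IDF variant are all assumptions made for illustration, and no classifier or cross-validation is included.

```python
import math
from collections import Counter


def ngrams(tokens, n_min, n_max):
    """Return all word n-grams with n_min <= n <= n_max, joined by spaces."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(n_min, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def tfidf(docs, n_min, n_max):
    """Return one {term: weight} dict per document (raw count times IDF).

    Assumption: a common smoothed IDF, ln((1 + N) / (1 + df)) + 1,
    with no length normalization. The study may use a different variant.
    """
    counts = [Counter(ngrams(doc.split(), n_min, n_max)) for doc in docs]
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for c in counts for term in c)
    idf = {t: math.log((1 + n_docs) / (1 + d)) + 1 for t, d in df.items()}
    return [{t: tf * idf[t] for t, tf in c.items()} for c in counts]


# Hypothetical toy reviews, not taken from the study's dataset.
reviews = ["aplikasi bagus", "aplikasi sering error", "bagus"]

CONFIGS = {"unigram": (1, 1), "bigram": (2, 2), "unigram+bigram": (1, 2)}

for name, (lo, hi) in CONFIGS.items():
    vectors = tfidf(reviews, lo, hi)
    vocabulary = {term for vec in vectors for term in vec}
    empty = sum(1 for vec in vectors if not vec)
    print(f"{name}: {len(vocabulary)} features, {empty} empty document(s)")
```

On these toy reviews the bigram-only configuration leaves the one-word review with no features at all, which is a small-scale version of the sparsity the abstract describes for short, informal reviews. The unigram+bigram configuration keeps every unigram feature and adds the bigrams on top.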

Copyrights © 2026