In sentiment classification systems based on the Naïve Bayes classifier, a commonly used feature extraction method is TF-IDF over unigrams or bigrams, with the two applied separately. In practice, most texts contain both single words and multi-word expressions, so combining unigrams and bigrams should help maximize classification accuracy. This research studies how a classification system that combines both compares with systems that use unigrams or bigrams alone. Using 1,000 reviews with ratings of 1 (negative) and 5 (positive) from Gojek users on the Google Play Store, and validating performance with K-fold cross-validation at K=10, the system that uses combined unigram and bigram TF-IDF features achieves the best performance of the three, with an accuracy of 0.84, compared with 0.83 for the system that uses unigrams alone and 0.7 for the system that uses bigrams alone. From these results, it can be concluded that combining unigrams and bigrams can increase the accuracy of the classification result.
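As an illustration of the combined feature extraction described above, the following is a minimal sketch, not the implementation used in this research. The function names `extract_ngrams` and `tfidf`, the whitespace tokenization, and the plain tf × log(N/df) weighting are all assumptions made for this example; the TF-IDF variant and preprocessing actually used are not stated here.

```python
import math
from collections import Counter


def extract_ngrams(text, ngram_range=(1, 2)):
    """Return word n-grams for every n in the inclusive ngram_range.

    (1, 1) gives unigrams only, (2, 2) bigrams only, and (1, 2) the
    combination of both. Tokenization is a naive lowercase whitespace
    split (an assumption for this sketch).
    """
    tokens = text.lower().split()
    lo, hi = ngram_range
    grams = []
    for n in range(lo, hi + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def tfidf(docs, ngram_range=(1, 2)):
    """Compute one {term: weight} dict per document.

    Uses an unsmoothed variant (assumed for illustration):
    tf = count / number of terms in the document, idf = log(N / df).
    """
    term_lists = [extract_ngrams(d, ngram_range) for d in docs]
    n_docs = len(docs)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for terms in term_lists:
        df.update(set(terms))

    weights = []
    for terms in term_lists:
        counts = Counter(terms)
        total = len(terms)
        weights.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return weights


# Hypothetical example reviews, not taken from the dataset.
reviews = ["driver was not friendly", "driver was very friendly"]
features = tfidf(reviews, ngram_range=(1, 2))
```

With `ngram_range=(1, 2)`, the bigram `not friendly` becomes a feature of its own alongside the unigrams `not` and `friendly`, which is the kind of multi-word expression that unigram features alone cannot represent. The resulting weights could then be passed to a Naïve Bayes classifier and evaluated with 10-fold cross-validation.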
Copyrights © 2023