BAREKENG: Jurnal Ilmu Matematika dan Terapan
Vol 19 No 2 (2025): BAREKENG: Journal of Mathematics and Its Application

SENTIMENT ANALYSIS OF REVIEWS ON X APPS ON GOOGLE PLAY STORE USING SUPPORT VECTOR MACHINE AND N-GRAM FEATURE SELECTION

Kusumo, Fahri Aimar
Saputro, Dewi Retno Sari
Widyaningsih, Purnami



Article Info

Publish Date
01 Apr 2025

Abstract

Sentiment analysis is an application of text mining used to identify opinions about a particular event or topic from a set of textual data. Its main function is to extract information and to determine the meaning and opinions expressed by users. Sentiment analysis requires a classification algorithm, such as the Support Vector Machine (SVM). SVM is frequently used for text classification because it can handle high-dimensional data. The concept of SVM is to determine the best hyperplane separating two classes in the input space. Text data with a large number of features causes data imbalance and affects the classification process, so feature selection is necessary. Feature selection is a technique for removing irrelevant attributes from the dataset. N-gram feature selection is a statistics-based approach to classifying text. N-grams are able to classify unknown text with a high degree of certainty. In sentiment analysis, N-grams work well despite textual errors, run efficiently, require little storage, and process quickly. This research aims to perform sentiment analysis on application reviews from the Google Play Store using SVM with unigram, bigram, and trigram feature selection. The methodology comprises theoretical study, web scraping, text preprocessing, sentiment labeling with VADER, term weighting with TF-IDF, splitting the data into training data (80%) and testing data (20%), training and evaluating the models, classifying the testing data, and interpreting the results. In total, 3151 testing data were classified. SVM with unigram feature selection achieved the highest accuracy, 90%, with an AUC of 0.93 (excellent). SVM with bigram feature selection achieved an accuracy of 78% with an AUC of 0.81 (good). SVM with trigram feature selection had the lowest accuracy, 68%, with an AUC of 0.66 (poor).
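
The n-gram extraction and TF-IDF weighting steps named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the function names `ngrams` and `tfidf`, the whitespace tokenizer, the sample reviews, and the unsmoothed weighting tf × log(N / df) are all assumptions made for illustration, and libraries commonly apply smoothed variants instead.

```python
from collections import Counter
import math


def ngrams(tokens, n):
    """Return the contiguous n-token sequences of `tokens`, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf(docs, n):
    """Return one {n-gram: weight} dict per document.

    Assumed weighting (hypothetical choice, not taken from the paper):
    tf = count / number of n-grams in the document, idf = log(N / df).
    """
    # Naive tokenization: lowercase and split on whitespace.
    grams_per_doc = [ngrams(doc.lower().split(), n) for doc in docs]

    # Document frequency: in how many documents each n-gram occurs.
    df = Counter()
    for grams in grams_per_doc:
        df.update(set(grams))

    n_docs = len(docs)
    weights = []
    for grams in grams_per_doc:
        counts = Counter(grams)
        total = len(grams)  # counts is empty when total == 0, so no division by zero
        weights.append(
            {g: (c / total) * math.log(n_docs / df[g]) for g, c in counts.items()}
        )
    return weights


# Made-up sample reviews, for illustration only.
reviews = [
    "the app is great",
    "the app is slow",
    "great update love it",
]

for n, name in ((1, "unigram"), (2, "bigram"), (3, "trigram")):
    print(name, tfidf(reviews, n)[0])
```

In a full pipeline, weight dictionaries of this kind would be converted into fixed-length vectors, one per review, before being passed to an SVM classifier. An n-gram that occurs in every document receives a weight of zero under this weighting, because log(N / df) = log(1) = 0.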

Copyrights © 2025






Journal Info

Abbrev

barekeng

Publisher

Subject

Computer Science & IT; Control & Systems Engineering; Economics, Econometrics & Finance; Energy; Engineering; Mathematics; Mechanical Engineering; Physics; Transportation

Description

BAREKENG: Jurnal Ilmu Matematika dan Terapan is a scientific publication that publishes articles on the results of research or study in the fields of Pure Mathematics and Applied Mathematics. The focus and scope of BAREKENG: Jurnal Ilmu Matematika dan Terapan are as follows: - Pure ...