This study analyzes and compares the performance of three sentiment classification algorithms, Support Vector Machine (SVM), Naïve Bayes (NB), and K-Nearest Neighbor (K-NN), in classifying user reviews of the Satu Sehat application. Preprocessing consists of text cleaning (normalization and removal of punctuation, numbers, and irrelevant characters), stopword removal, and stemming to reduce words to their root forms. Features are extracted with CountVectorizer using a bag-of-words approach, which converts each review into a vector of term counts. The dataset is then divided into training and testing subsets at an 80:20 ratio. Model performance is evaluated with a confusion matrix, from which accuracy, precision, recall, and F1-score are derived. In experiments on 9,192 user reviews, the SVM with a linear kernel achieved the best overall performance, with a higher accuracy than both NB and K-NN. These findings suggest that SVM handles high-dimensional textual features more effectively, making it a suitable algorithm for sentiment analysis of digital health application reviews, particularly those of Satu Sehat.
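The bag-of-words representation and the confusion-matrix metrics named above can be illustrated with a minimal, standard-library sketch. It is not the study's implementation: whitespace tokenization stands in for CountVectorizer's own tokenizer, the labels are assumed to be binary, and the function names (`bag_of_words`, `binary_metrics`), reviews, and labels are hypothetical and chosen only for illustration.

```python
from collections import Counter


def bag_of_words(docs):
    """Return (vocabulary, count vectors) for whitespace-tokenized documents."""
    # Assumption: simple whitespace splitting; CountVectorizer tokenizes differently.
    vocab = sorted({token for doc in docs for token in doc.split()})
    column = {token: i for i, token in enumerate(vocab)}
    vectors = []
    for doc in docs:
        row = [0] * len(vocab)
        for token, count in Counter(doc.split()).items():
            row[column[token]] = count
        vectors.append(row)
    return vocab, vectors


def binary_metrics(y_true, y_pred, positive):
    """Derive accuracy, precision, recall, and F1 from confusion-matrix counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical, already-preprocessed reviews (not taken from the study's data).
reviews = ["aplikasi bagus", "aplikasi error terus error"]
vocab, vectors = bag_of_words(reviews)

# Hypothetical true and predicted labels for five test reviews.
y_true = ["pos", "pos", "neg", "neg", "pos"]
y_pred = ["pos", "neg", "neg", "pos", "pos"]
scores = binary_metrics(y_true, y_pred, positive="pos")
```

Each row of `vectors` has one column per vocabulary term, so the feature space grows with the vocabulary; this is the high-dimensional setting in which the classifiers are compared.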
Copyrights © 2026