This study analyzes the sentiment of user reviews of the ChatGPT application on the Google Play Store, a platform that directly reflects public opinion of this increasingly popular artificial intelligence application. A total of 10,000 reviews were collected through web scraping and passed through a series of preprocessing stages: data cleaning to remove noise, case folding to standardize letter case, tokenization to split sentences into words, normalization to map informal words to their standard forms, and stopword removal to eliminate common but uninformative words. Features were then weighted using the Term Frequency-Inverse Document Frequency (TF-IDF) method under three n-gram scenarios (Unigram, Unigram+Bigram, and Unigram+Trigram), followed by Chi-Square feature selection to identify the most relevant features. The weighted data were classified using two machine learning algorithms: K-Nearest Neighbors (KNN) and Decision Tree. The evaluation results show that the Decision Tree model with Unigram+Bigram features achieved the highest accuracy, 0.8089 (80.89%), with an F1-Score of 0.8894 (88.94%), making it the best-performing model in this study. These findings can help application developers understand user perceptions, identify areas for improvement, and enhance the quality of ChatGPT services, particularly when working with imbalanced review data.
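The preprocessing and weighting stages described above can be illustrated with a minimal Python sketch that uses only the standard library. It is not the study's implementation. The slang dictionary, the stopword list, the sample reviews, and the function names `preprocess`, `ngram_features`, and `tfidf` are all hypothetical, and the weighting uses the plain textbook form tf × log(N/df), whereas library implementations often apply smoothed variants. Chi-Square selection and the KNN and Decision Tree classifiers are not shown.

```python
import math
import re
from collections import Counter

# Hypothetical lookup tables: the study's actual dictionaries are not given.
SLANG = {"u": "you", "gr8": "great", "thx": "thanks"}
STOPWORDS = {"the", "is", "a", "an", "and", "to", "it", "this", "you"}


def preprocess(text):
    """Apply cleaning, case folding, tokenization, normalization, stopword removal."""
    text = re.sub(r"http\S+|[^a-zA-Z0-9\s]", " ", text)  # cleaning: drop URLs and symbols
    text = text.lower()                                   # case folding
    tokens = text.split()                                 # tokenization on whitespace
    tokens = [SLANG.get(t, t) for t in tokens]            # normalization of informal words
    return [t for t in tokens if t not in STOPWORDS]      # stopword removal


def ngram_features(tokens, orders=(1, 2)):
    """Build n-gram features; orders=(1, 2) corresponds to Unigram+Bigram."""
    features = []
    for n in orders:
        features += [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return features


def tfidf(docs):
    """Weight each term by (relative term frequency) * log(N / document frequency)."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical sample reviews, for illustration only.
reviews = [
    "This app is GR8, thx!!!",
    "The app crashes a lot",
    "Great answers and great app",
]
docs = [ngram_features(preprocess(r), orders=(1, 2)) for r in reviews]
weights = tfidf(docs)

# Show the three highest-weighted features of each review.
for review, w in zip(reviews, weights):
    top = sorted(w.items(), key=lambda kv: kv[1], reverse=True)[:3]
    print(review, "->", top)
```

Passing `orders=(1,)` or `orders=(1, 3)` to `ngram_features` gives the Unigram and Unigram+Trigram scenarios. In this sketch a term that occurs in every review, such as "app", receives a weight of zero, which shows how TF-IDF downweights terms that do not distinguish one review from another.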
Copyright © 2025