Sentiment analysis is a natural language processing task that identifies and categorizes the opinions or attitudes users express toward an entity in text. The dataset consists of the 500 most recently uploaded captions, collected with the Phantombuster tool. The analysis proceeds through data crawling, preprocessing (removal of duplicate and empty entries, tokenization, stopword removal, and case folding), classification with the Naïve Bayes algorithm, and visualization of the classification results. Most of the data was classified as neutral (97.65%), with the remainder split between positive (1.57%) and negative (0.78%); the model reached an accuracy of 94%. Although this accuracy is relatively high, the dominance of the neutral class indicates an imbalanced class distribution, which can inflate accuracy and limit how well the model generalizes to the minority classes.
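The preprocessing and classification stages can be illustrated with a minimal sketch. It is not the pipeline used in this study: the toy captions, their labels, the stopword list, and the names `tokenize`, `clean`, and `NaiveBayes` are all hypothetical, and the multinomial variant with Laplace smoothing is an assumption, since the Naïve Bayes variant is not specified here. The tokenizer also assumes Latin-alphabet text.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"the", "a", "an", "is", "this", "and", "to", "of", "it"}

def tokenize(text):
    """Case-fold, split into word tokens, and drop stopwords."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]

def clean(pairs):
    """Remove empty and duplicate captions from (caption, label) pairs."""
    seen, kept = set(), []
    for text, label in pairs:
        key = text.strip().lower()
        if key and key not in seen:
            seen.add(key)
            kept.append((tokenize(key), label))
    return kept

class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs):
        self.class_counts = Counter(label for _, label in docs)
        self.word_counts = defaultdict(Counter)
        for tokens, label in docs:
            self.word_counts[label].update(tokens)
        self.vocab = {t for tokens, _ in docs for t in tokens}
        self.total = sum(self.class_counts.values())
        return self

    def predict(self, tokens):
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            # Log prior plus the log likelihood of each known token.
            score = math.log(count / self.total)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for t in tokens:
                if t in self.vocab:  # tokens unseen in training are skipped
                    score += math.log((self.word_counts[label][t] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Toy, hand-labeled captions (not the study's data).
train = [
    ("New schedule posted for this week", "neutral"),
    ("new schedule posted for this week", "neutral"),  # duplicate after case folding
    ("", "neutral"),                                   # empty caption
    ("Office hours are updated", "neutral"),
    ("Registration opens on Monday", "neutral"),
    ("We love this amazing event", "positive"),
    ("Terrible delay and bad service", "negative"),
]

docs = clean(train)
model = NaiveBayes().fit(docs)

print(len(docs))
print(model.predict(tokenize("Amazing event, love it")))
print(model.predict(tokenize("Bad delay")))
print(model.predict(tokenize("Hello world")))
```

In this sketch, a caption containing no words seen during training is scored by the class prior alone, so it falls back to the most frequent class. With a heavily skewed label distribution, that fallback is one way a dominant neutral class can absorb most predictions.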
Copyright © 2025