This study performs sentiment analysis on Twitter data to determine the direction of sentiment toward the TikTokShop feature. The k-nearest neighbor (k-NN) method is implemented with cosine similarity as the distance metric and with the number of nearest neighbors set to k = 3, 5, 7, and 9. The models are evaluated with 10-fold cross-validation. The study also uses unigram, bigram, and trigram feature selection, and handles imbalanced data with an undersampling technique. The best model uses unigram feature selection with k = 3 nearest neighbors; it achieves an average accuracy of 89.92%, an average precision of 90.54%, and an average recall of 87.37%. In testing, the unigram feature selection again performed best, with 91% accuracy, 92% precision, and 89% recall.
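The pipeline described above can be illustrated with a minimal pure-Python sketch. It is not the study's implementation. The whitespace tokenizer, raw count weighting, random undersampling, the function names, and the toy tweets are all assumptions made for illustration, and cross-validation and metric computation are omitted for brevity.

```python
import math
import random
from collections import Counter


def ngrams(text, n):
    """Return the word n-grams of a lowercased, whitespace-tokenized text."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def vectorize(text, n=1):
    """Bag-of-n-grams count vector stored as a sparse Counter (assumed weighting)."""
    return Counter(ngrams(text, n))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def undersample(texts, labels, seed=0):
    """Randomly drop examples so every class matches the smallest class size."""
    rng = random.Random(seed)
    by_label = {}
    for text, label in zip(texts, labels):
        by_label.setdefault(label, []).append(text)
    smallest = min(len(group) for group in by_label.values())
    balanced = []
    for label, group in by_label.items():
        for text in rng.sample(group, smallest):
            balanced.append((text, label))
    rng.shuffle(balanced)
    return [t for t, _ in balanced], [l for _, l in balanced]


def knn_predict(train_texts, train_labels, query, k=3, n=1):
    """Label the query by majority vote among its k most similar training texts."""
    q = vectorize(query, n)
    scored = [(cosine_similarity(q, vectorize(t, n)), label)
              for t, label in zip(train_texts, train_labels)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data, not taken from the study's Twitter corpus.
TRAIN_TEXTS = [
    "fast delivery great price",
    "great price love it",
    "love the fast delivery",
    "slow delivery bad seller",
    "bad seller never again",
]
TRAIN_LABELS = ["pos", "pos", "pos", "neg", "neg"]

if __name__ == "__main__":
    print(knn_predict(TRAIN_TEXTS, TRAIN_LABELS, "great price fast delivery", k=3, n=1))
```

Passing n=2 or n=3 switches the sketch to bigram or trigram features, and k can be set to 3, 5, 7, or 9 to mirror the configurations compared in the study.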
Copyrights © 2022