This study investigates the performance gap between two automatic sentiment labeling strategies, one based on star ratings and the other derived from the review text, when classifying application reviews with the K-Nearest Neighbor (KNN) algorithm. Each review is converted into a TF-IDF vector, and the effect of each labeling approach on the resulting classifier is examined. Performance is evaluated using accuracy, precision, recall, and F1-score. The content-based method achieves an accuracy of 0.81 and yields more reliable results than the score-based variant. The score-based approach is less consistent, largely because of mismatches between numerical ratings and the sentiment conveyed in the written text. The study is limited by its focus on a single application domain and its reliance on a single classical baseline classifier, which may be sensitive to class imbalance. Future work should incorporate more diverse datasets, adopt modern text representations such as word embeddings or transformer-based encodings, and explore classification algorithms that better handle uneven class distributions.
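As a minimal illustration of the pipeline summarized above, the following Python sketch labels a handful of reviews in two ways, builds TF-IDF vectors, and scores a KNN classifier against each label set. It is not the study's implementation. The toy reviews, the binary label set, the 4-star threshold, the small sentiment lexicon, the cosine similarity measure, the leave-one-out evaluation, and all function names are assumptions introduced only for illustration.

```python
import math
from collections import Counter

# Hypothetical toy reviews (text, star rating); not data from the study.
REVIEWS = [
    ("great app love it", 5),
    ("love the new design great work", 4),
    ("terrible update app keeps crashing", 1),
    ("bad app crashing all the time", 2),
    ("great idea but terrible crashing bad", 5),  # rating contradicts the text
    ("love it great", 1),                         # rating contradicts the text
]

# Assumed mini-lexicon standing in for a content-based labeler.
POSITIVE = {"great", "love"}
NEGATIVE = {"terrible", "bad", "crashing"}

def label_by_score(rating):
    """Score-based label; the 4-star threshold is an assumption."""
    return "pos" if rating >= 4 else "neg"

def label_by_content(text):
    """Content-based label from a simple lexicon vote (assumed method)."""
    tokens = text.split()
    balance = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    return "pos" if balance >= 0 else "neg"

def tfidf_vectors(docs):
    """Return one sparse TF-IDF dict per document, using a smoothed IDF."""
    tokenized = [d.split() for d in docs]
    n = len(tokenized)
    df = Counter(term for toks in tokenized for term in set(toks))
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}
    return [
        {t: (cnt / len(toks)) * idf[t] for t, cnt in Counter(toks).items()}
        for toks in tokenized
    ]

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def knn_predict(query, train_vecs, train_labels, k=3):
    """Majority vote among the k most similar training vectors."""
    ranked = sorted(range(len(train_vecs)),
                    key=lambda i: cosine(query, train_vecs[i]), reverse=True)
    votes = Counter(train_labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]

def evaluate(labels, vectors, k=3):
    """Leave-one-out accuracy of KNN measured against the given label set."""
    hits = 0
    for i, query in enumerate(vectors):
        train_vecs = vectors[:i] + vectors[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        hits += knn_predict(query, train_vecs, train_labels, k) == labels[i]
    return hits / len(vectors)

def precision_recall_f1(y_true, y_pred, positive="pos"):
    """Precision, recall, and F1 for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

texts = [text for text, _ in REVIEWS]
vectors = tfidf_vectors(texts)
score_labels = [label_by_score(rating) for _, rating in REVIEWS]
content_labels = [label_by_content(text) for text in texts]

print("score-based accuracy:  ", evaluate(score_labels, vectors))
print("content-based accuracy:", evaluate(content_labels, vectors))
```

The last two toy reviews carry ratings that contradict their wording, so the two labelers disagree on them. This mirrors the rating and text mismatch that the abstract identifies as the main weakness of score-based labels. With real data, a library such as scikit-learn would normally replace the hand-written TF-IDF and KNN routines.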
Copyrights © 2026