The increasing popularity of skincare products for acne-prone skin has led to a surge in online consumer reviews. These reviews are characterized by informal language, domain-specific terminology, and an imbalanced sentiment distribution, all of which pose challenges for sentiment classification. This study aims to compare the performance and analyze the generalization behavior of two popular machine learning algorithms, Naïve Bayes and Support Vector Machine (SVM), for sentiment classification of reviews of skincare products for acne-prone skin. A comprehensive methodology was employed, including thorough text preprocessing, feature extraction using Term Frequency-Inverse Document Frequency (TF-IDF) with n-gram representation, and data balancing through the Synthetic Minority Over-sampling Technique (SMOTE). The study used a dataset of 4,004 labeled reviews categorized into positive and negative sentiments. The models were evaluated using stratified 5-fold cross-validation to ensure a robust and fair assessment. Results indicate that Naïve Bayes slightly outperforms SVM on the testing set, achieving the highest accuracy of 91.14% compared with 90.64% for SVM. While SVM demonstrated higher performance during training, its lower testing performance suggested a tendency toward overfitting, whereas Naïve Bayes exhibited more stable generalization on unseen data. Further qualitative analysis revealed that product effectiveness and user experience are the primary drivers of consumer sentiment, while competitive analysis highlighted distinct brand perception patterns across skincare categories. These findings indicate that simpler probabilistic models such as Naïve Bayes can provide robust and reliable performance for sentiment analysis in specialized and imbalanced skincare review datasets.
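To make the evaluation protocol concrete, the following is a minimal sketch of how stratified 5-fold splitting keeps the positive-to-negative ratio roughly constant in every fold. It is an illustration only, not the study's code: the function `stratified_folds`, the toy label list, and the 40/10 class split are assumptions introduced here, and the sketch omits shuffling. In practice a library routine such as scikit-learn's `StratifiedKFold` would normally be used.

```python
from collections import defaultdict


def stratified_folds(labels, k=5):
    """Assign each sample index to one of k folds with similar class proportions."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        # Deal each class's samples round-robin so every fold gets a near-equal share.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


# Toy, made-up labels (NOT the study's data): 40 positive and 10 negative reviews.
toy_labels = ["positive"] * 40 + ["negative"] * 10
folds = stratified_folds(toy_labels, k=5)

for fold_id, test_idx in enumerate(folds):
    held_out = set(test_idx)
    # Every sample outside the held-out fold is used for training in this round.
    train_idx = [i for i in range(len(toy_labels)) if i not in held_out]
    n_neg = sum(toy_labels[i] == "negative" for i in test_idx)
    print(f"fold {fold_id}: train={len(train_idx)} test={len(test_idx)} negative_in_test={n_neg}")
```

With these toy labels, each of the five held-out folds contains 10 reviews, 2 of them negative, so the minority class is represented in every test split.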
Copyrights © 2026