YouTube comments contain rich emotional expressions, but their sheer volume makes manual analysis impractical. This study proposes a multiclass emotion classification approach for Indonesian YouTube comments using the IndoBERT model integrated with a database-driven incremental learning system. Comments were collected through the YouTube Data API and labeled with one of six emotion categories: anger, sadness, happiness, fear, surprise, and neutral. Text preprocessing consisted of lowercasing, text cleaning, and normalization of informal Indonesian words. The model was fine-tuned under three training–testing split scenarios (60:40, 70:30, and 80:20). The 80:20 split achieved the highest accuracy, 68%, a result affected by the imbalanced class distribution, in which minority emotion classes were underrepresented. In addition, the system stores incoming data continuously and supports incremental retraining, allowing the model to learn from new data without being retrained from scratch. This adaptive mechanism makes the proposed system suitable for long-term emotion analysis of YouTube comments.
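The preprocessing steps and split scenarios described above can be illustrated with a minimal Python sketch. It is not the study's implementation: the slang dictionary `SLANG_MAP`, the cleaning rules (removing URLs, mentions, and non-letter characters), and the helper names `preprocess` and `split_sizes` are all assumptions made for illustration, since the abstract does not specify them.

```python
import re

# Hypothetical slang lexicon; the study's actual normalization dictionary is not given.
SLANG_MAP = {"gk": "tidak", "ga": "tidak", "bgt": "banget", "yg": "yang", "sy": "saya"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
NON_LETTER_RE = re.compile(r"[^a-z\s]")


def preprocess(comment):
    """Lowercase, clean, and normalize informal words in one comment."""
    text = comment.lower()                 # lowercasing
    text = URL_RE.sub(" ", text)           # cleaning: drop links (assumed rule)
    text = MENTION_RE.sub(" ", text)       # cleaning: drop @mentions (assumed rule)
    text = NON_LETTER_RE.sub(" ", text)    # cleaning: drop digits, punctuation, emoji
    tokens = [SLANG_MAP.get(tok, tok) for tok in text.split()]  # normalization
    return " ".join(tokens)


def split_sizes(n_samples, train_ratio):
    """Return (train, test) sample counts for a given training ratio."""
    n_train = round(n_samples * train_ratio)
    return n_train, n_samples - n_train


if __name__ == "__main__":
    print(preprocess("Videonya bagus BGT!!! https://youtu.be/x @admin gk nyesel"))
    for ratio in (0.6, 0.7, 0.8):          # the 60:40, 70:30, and 80:20 scenarios
        print(ratio, split_sizes(1000, ratio))
```

Because the class distribution is reported as imbalanced, a stratified split that preserves per-class proportions in both partitions would be a reasonable choice in practice; the abstract does not say whether the study used one.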
Copyrights © 2025