This study analyzes sentiment and toxicity in viewer comments on cultural documentary videos, following the Cross-Industry Standard Process for Data Mining (CRISP-DM). Sentiment polarity is classified with several machine-learning algorithms, including k-nearest neighbors (k-NN), decision tree (DT), naive Bayes classifier (NBC), and support vector machine (SVM), and toxic language patterns are identified across multiple videos. The Synthetic Minority Over-sampling Technique (SMOTE) is applied to address class imbalance and improve model performance. Sentiment classification accuracy ranges from 72.24% to 96.79%, indicating that the proposed methodology is effective. Toxicity analysis shows that the prevalence of toxic language varies across videos, with toxicity scores ranging from 0.01270 to 0.09334. The study also acknowledges that toxicity scoring algorithms are inherently limited in capturing contextual nuance. Overall, the research contributes to understanding sentiment dynamics and toxicity trends in cultural documentary content, and underscores the value of applying machine-learning techniques within a structured analytical framework for data interpretation and decision-making.
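
As an illustration of the oversampling idea behind SMOTE, the following is a minimal sketch, not the study's implementation (the abstract does not say which tool or settings were used). It creates synthetic minority-class points by interpolating between a minority sample and one of its nearest minority neighbours. The function name `smote`, its parameters, and the toy feature vectors are assumptions made for this example; the sketch also picks base samples at random, which simplifies the original algorithm's per-sample loop.

```python
import random
from math import dist


def smote(minority, n_synthetic, k=5, seed=0):
    """Return n_synthetic new points interpolated between minority samples
    and their k nearest minority-class neighbours (simplified SMOTE)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest neighbours of the base point among the other minority samples
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy feature vectors (hypothetical, not the study's data)
majority = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2], [0.0, 0.4], [0.4, 0.1]]
minority = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.3]]

# Oversample the minority class until both classes have the same size
new_points = smote(minority, n_synthetic=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

In practice, oversampling of this kind is normally applied to the training split only, so that synthetic points do not leak into the data used to report accuracy.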
                        
Copyright © 2024