This research presents an approach to analyzing digital content that integrates toxicity analysis, clustering, and Social Network Analysis (SNA) to better understand online interactions. Average toxicity levels are low, with average scores of 0.06355 for toxicity and 0.00468 for severe toxicity, but there are pronounced spikes, with maximum scores reaching 0.82996 for toxicity and 0.89494 for profanity. These spikes call for continuous monitoring and adaptive moderation strategies to limit the impact of harmful language. Clustering methods, including K-Means, HDBSCAN, and Gaussian Mixture Models, expose the thematic structure of viewer discourse, identifying both prevalent and niche topics. The Gaussian Mixture Model identified ten distinct clusters, while HDBSCAN revealed clusters of varying density, reflecting the diverse range of discussions within the community. In addition, SNA of a network with 1,716 nodes and 37 edges offers insight into its relational dynamics, pinpointing key influencers and mapping the flow of information between user groups. Together, these methods provide a framework for understanding both the content and the context of digital interactions, supporting more effective strategies for enhancing community engagement, mitigating toxicity, and promoting a healthier, more inclusive online environment.
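
The network size reported above (1,716 nodes, 37 edges) can be turned into two standard summary measures, density and mean degree. The following is a minimal sketch, assuming the network is a simple undirected graph, which the abstract does not state; the function names are illustrative and are not taken from the study.

```python
def undirected_density(n_nodes: int, n_edges: int) -> float:
    """Share of all possible node pairs that are joined by an edge.

    Assumes a simple undirected graph (no self-loops, no parallel edges).
    """
    if n_nodes < 2:
        return 0.0
    return 2 * n_edges / (n_nodes * (n_nodes - 1))


def mean_degree(n_nodes: int, n_edges: int) -> float:
    """Average number of edge endpoints per node in an undirected graph."""
    if n_nodes == 0:
        return 0.0
    return 2 * n_edges / n_nodes


# Node and edge counts as reported in the abstract.
N_NODES = 1716
N_EDGES = 37

density = undirected_density(N_NODES, N_EDGES)
avg_degree = mean_degree(N_NODES, N_EDGES)

print(f"density     = {density:.3e}")
print(f"mean degree = {avg_degree:.4f}")
```

For a directed network, the density denominator would be `n_nodes * (n_nodes - 1)` without the factor of 2 in the numerator, so the values here apply only under the undirected assumption.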
Copyrights © 2024