Building of Informatics, Technology and Science
Vol 6 No 1 (2024): June 2024

Travel Content Evaluation through Sentiment and Toxicity Analysis using CRISP-DM

Singgalen, Yerik Afrianto (Unknown)



Article Info

Publish Date
30 Jun 2024

Abstract

This research applies the CRISP-DM methodology to analyze sentiment and toxicity in digital content, focusing on tourism-related videos. Sentiment was assessed with VADER and TextBlob, and toxicity with Detoxify and the Perspective API. The study analyzed 25,361 posts, of which 23,292 were processed for sentiment and 24,171 for toxicity. Four classifiers, k-Nearest Neighbors (k-NN), Decision Tree (DT), Naive Bayes Classifier (NBC), and Support Vector Machine (SVM), were applied with the Synthetic Minority Over-sampling Technique (SMOTE) to address class imbalance. SVM achieved the highest performance, with an accuracy of 54.80% and an F-measure of 66.01%; the other algorithms scored lower. The deployment phase integrated these models for real-time analysis, providing actionable insights into user engagement. The findings emphasize the impact of sentiment on brand perception and the need to manage toxic behavior to maintain a healthier online environment. Despite limitations such as dataset imbalance and dependence on the chosen models, the study offers recommendations for content creators, advocating robust moderation and sentiment-based strategies to enhance user interaction. Future research should include more diverse datasets and additional tools to improve the robustness and applicability of the findings. This research contributes to the understanding of digital content dynamics and provides strategic insights for optimizing content creation and user engagement.
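As a rough illustration of two ideas named in the abstract, the sketch below shows SMOTE-style oversampling (new minority-class points interpolated between a sample and one of its nearest minority neighbours) and how accuracy and F-measure are computed from predictions. It is not the author's implementation: the function names `smote_like_oversample` and `accuracy_and_f_measure`, the toy data, and the parameter defaults are all assumptions made for this example, and the study may have relied on a library or tool instead.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Return n_new synthetic points for the minority class.

    Hypothetical helper: each new point lies on the segment between a
    randomly chosen minority sample and one of its k nearest minority
    neighbours. Requires at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the chosen sample, excluding itself
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


def accuracy_and_f_measure(y_true, y_pred, positive=1):
    """Return (accuracy, F-measure) for a binary labelling."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return accuracy, 0.0
    return accuracy, 2 * precision * recall / (precision + recall)


# Toy data (assumed, not from the study)
minority = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0)]
new_points = smote_like_oversample(minority, n_new=3)
acc, f1 = accuracy_and_f_measure([1, 1, 1, 0, 0], [1, 1, 0, 1, 0])
```

On the toy labels, accuracy is 0.6 and the F-measure is about 0.667, which shows that an F-measure can exceed accuracy, as in the SVM figures reported in the abstract.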

Copyrights © 2024






Journal Info

Abbrev

bits

Publisher

Subject

Computer Science & IT

Description

Building of Informatics, Technology and Science (BITS) is an open-access journal that publishes scientific articles presenting research results in information technology and computing. Papers submitted to the journal are checked for plagiarism and peer-reviewed to maintain quality. ...