This study examines backpacker tourism in Indonesia through online content about Bandung, Dieng, Borobudur, Ijen, Bromo, Tumpak Sewu, Malang, Banyuwangi, and Bali. Using the Digital Content Reviews and Analysis Framework, it systematically processed user-generated content to assess sentiment and toxicity. Most interactions were non-toxic, but harmful language spiked occasionally, particularly in the profanity and identity-attack categories. For example, toxicity scores for Malang, Banyuwangi, and Bali averaged 0.06995 but peaked at 0.78207, underscoring the need for ongoing content moderation. The study also applied a Support Vector Machine (SVM) model, using SMOTE (Synthetic Minority Over-sampling Technique) to handle class imbalance. The model achieved an accuracy of 82.64% and a recall of 97.39%, indicating that it identifies positive cases with few false negatives. AUC scores ranging from 0.970 to 0.979 indicated strong discriminatory power. These findings highlight the potential of machine learning models for analyzing large-scale, imbalanced datasets in tourism research. Overall, the study offers insight into traveler perceptions of Indonesia’s backpacker destinations and emphasizes the importance of context in understanding online discourse. The combination of toxicity analysis and SVM modeling has practical implications for tourism management, content moderation, and the promotion of sustainable tourism practices.
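The abstract does not describe how SMOTE was implemented, so the following is only a minimal sketch of the technique's core idea: each synthetic minority sample is placed at a random point on the line segment between an existing minority sample and one of its nearest minority neighbours. The function name `smote`, the toy feature vectors, and `majority_size` are hypothetical and are not taken from the study. In practice a library implementation such as `SMOTE` from imbalanced-learn would normally be used instead.

```python
import random
from math import dist


def smote(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points interpolated between minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Find its k nearest minority neighbours (excluding itself).
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # Interpolate: base + gap * (neighbour - base), with gap in [0, 1).
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical 2-D feature vectors for the under-represented class.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
majority_size = 10  # assumed size of the majority class

# Oversample the minority class until both classes have the same size.
new_points = smote(minority, n_new=majority_size - len(minority))
balanced_minority = minority + new_points
print(len(balanced_minority))  # prints 10
```

Because every synthetic point is a convex combination of two real minority samples, the oversampled class stays inside the region the original samples already occupy. Oversampling of this kind is usually applied to the training split only, so that the evaluation data contain no synthetic samples.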
Copyrights © 2024