This study evaluates six deep learning models for detecting spam in YouTube comments: a Multilayer Perceptron (MLP), a Convolutional Neural Network (CNN), a Long Short-Term Memory network (LSTM), a Bidirectional LSTM (BiLSTM), a Gated Recurrent Unit (GRU), and an attention-based model. The dataset consists of 1,956 real comments extracted from popular YouTube videos, comprising both spam and legitimate messages. Preprocessing involved tokenizing the text and padding the resulting sequences to a uniform length for model input. The LSTM model achieved the highest test accuracy, 95.65%, outperforming the other models by capturing sequential dependencies and context within comments. The CNN model also achieved high accuracy, underscoring the importance of local pattern recognition in text classification. The BiLSTM and attention-based models performed comparably to the LSTM but did not surpass it, indicating that sequential modeling itself plays the crucial role in this task. The GRU model, though computationally efficient, was slightly less accurate than the LSTM and BiLSTM. The MLP model, serving as a baseline, exhibited limited performance, emphasizing the need for more advanced architectures in spam detection. These findings suggest that combining sequential modeling with local feature extraction could lead to more robust spam detection systems.
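The tokenization and padding step mentioned above can be illustrated with a minimal, standard-library sketch. It is not the study's actual pipeline: the function names, the regular expression, the vocabulary cap, the reserved ids, the sequence length of 8, the choice to pad on the right, and the three sample comments are all assumptions made for illustration.

```python
import re
from collections import Counter

PAD_ID = 0  # assumed id reserved for padding
OOV_ID = 1  # assumed id reserved for words not seen when building the vocabulary


def tokenize(text):
    """Lowercase a comment and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_vocab(comments, max_words=10000):
    """Map the most frequent words to integer ids, starting at 2."""
    counts = Counter(tok for c in comments for tok in tokenize(c))
    return {w: i + 2 for i, (w, _) in enumerate(counts.most_common(max_words))}


def encode(text, vocab):
    """Convert a comment into a list of integer ids."""
    return [vocab.get(tok, OOV_ID) for tok in tokenize(text)]


def pad(seq, max_len):
    """Truncate to max_len, or right-pad with PAD_ID up to max_len."""
    seq = seq[:max_len]
    return seq + [PAD_ID] * (max_len - len(seq))


# Hypothetical sample comments, not taken from the dataset.
comments = [
    "Check out my channel!!!",
    "I love this song",
    "Subscribe to my channel for free gifts",
]
vocab = build_vocab(comments)
batch = [pad(encode(c, vocab), max_len=8) for c in comments]
```

Every row of `batch` has the same length, so the rows can be stacked into a single integer matrix and passed to an embedding layer. Deep learning libraries provide equivalent utilities, and some pad at the start of the sequence by default rather than at the end.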
Copyright © 2024