This study compares traditional machine learning methods and sequential deep learning models for text-based spam classification. Previous studies differ in their datasets, preprocessing techniques, and experimental settings, so their results do not allow a consistent, fair comparison between these approaches. To overcome this limitation, this research proposes a controlled comparative evaluation framework that uses a unified dataset, standardized preprocessing, a consistent data split, and identical evaluation metrics. The dataset consists of 5,572 messages with an imbalanced class distribution; therefore, oversampling was applied to the training data to mitigate bias toward the majority class. The evaluated models are a TF-IDF-based Logistic Regression baseline and three deep learning models: a Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), and a Gated Recurrent Unit (GRU).
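The data-handling step described above (a consistent split, with oversampling applied to the training data) could be sketched as follows. This is a minimal illustration, not the study's implementation: the abstract does not state the split ratio or the oversampling method, so the 80/20 stratified split, the random-duplication strategy, the fixed seed, and the function names `stratified_split` and `oversample` are all assumptions, and the labels are toy data rather than the 5,572-message dataset.

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, test_fraction=0.2, seed=42):
    """Return (train_idx, test_idx), keeping class ratios similar in both parts.

    The 80/20 ratio and the seed are assumptions, not values from the study.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train_idx, test_idx = [], []
    for y in sorted(by_class):
        idx = by_class[y][:]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return train_idx, test_idx


def oversample(indices, labels, seed=42):
    """Randomly duplicate minority-class indices until every class
    matches the size of the largest class (assumed strategy)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i in indices:
        by_class[labels[i]].append(i)
    target = max(len(v) for v in by_class.values())
    balanced = []
    for y in sorted(by_class):
        idx = by_class[y]
        balanced.extend(idx)
        # Sample with replacement to fill the gap to the majority count.
        balanced.extend(rng.choices(idx, k=target - len(idx)))
    rng.shuffle(balanced)
    return balanced


if __name__ == "__main__":
    # Toy imbalanced labels, for illustration only.
    labels = ["ham"] * 80 + ["spam"] * 20
    train_idx, test_idx = stratified_split(labels)
    balanced_train = oversample(train_idx, labels)
    print(Counter(labels[i] for i in balanced_train))
    print(Counter(labels[i] for i in test_idx))
```

Oversampling only after the split keeps duplicated messages out of the test set, so the held-out data retains the original class imbalance and every model is scored on the same untouched examples.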
Copyrights © 2026