This study classifies expressions of language anxiety about English as a foreign language, as reflected in user-generated texts on Twitter. It applies two machine learning approaches, Support Vector Machine (SVM) and Convolutional Neural Network (CNN), to classify anxiety levels automatically. The dataset was collected by crawling Twitter, filtered for relevance, and manually annotated on a three-point scale (low, medium, high) based on psychological indicators such as fear of speaking, avoidance, and self-perceived inability. Preprocessing comprised text normalization, tokenization, and stopword removal, followed by feature extraction using TF-IDF over unigrams, bigrams, and trigrams. The models were trained on a balanced dataset and evaluated through cross-validation and tuning of key hyperparameters. SVM achieved the highest accuracy, 98.40%, and remained stable across the test conditions. CNN initially performed competitively but showed a slight drop in performance after tuning, which suggests sensitivity to parameter settings and data volume. The findings indicate that SVM is the more robust model and better suited to limited-data environments, making it a practical tool for classifying psychological traits such as language anxiety in digital communication. This research offers insight into the potential of machine learning for psychological and linguistic analysis, particularly of social media text.
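As an illustration of the feature-extraction step, the following is a minimal sketch of TF-IDF weighting over unigrams to trigrams. It is not the study's implementation: the abstract does not state the tokenizer, the TF-IDF variant, or the library used, so the whitespace tokenization, the length-normalized term frequency, the smoothing-free `ln(N / df) + 1` inverse document frequency, and the function names `ngrams` and `tfidf` are all assumptions made for this example.

```python
import math
from collections import Counter


def ngrams(tokens, n_min=1, n_max=3):
    """Return all contiguous n-grams, joined by spaces, for n_min <= n <= n_max."""
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def tfidf(docs, n_min=1, n_max=3):
    """Return one {n-gram: weight} dictionary per document.

    Assumed weighting (not taken from the study):
      tf  = count of the n-gram / number of n-grams in the document
      idf = ln(N / df) + 1, where df is the number of documents containing it
    """
    # Assumption: lowercase and split on whitespace; no stopword removal here.
    grams_per_doc = [ngrams(d.lower().split(), n_min, n_max) for d in docs]

    # Document frequency: count each n-gram once per document.
    df = Counter()
    for grams in grams_per_doc:
        df.update(set(grams))

    n_docs = len(docs)
    vectors = []
    for grams in grams_per_doc:
        counts = Counter(grams)
        total = len(grams)
        vectors.append({
            g: (c / total) * (math.log(n_docs / df[g]) + 1.0)
            for g, c in counts.items()
        })
    return vectors


if __name__ == "__main__":
    # Hypothetical example texts, not drawn from the study's dataset.
    sample = [
        "i am afraid to speak english",
        "i avoid speaking english in class",
        "my english is fine",
    ]
    for vec in tfidf(sample):
        top = sorted(vec.items(), key=lambda kv: -kv[1])[:3]
        print(top)
```

The resulting sparse dictionaries would then be mapped to fixed-length vectors before being passed to a classifier such as an SVM. N-grams shared by every document (here, `english`) receive the lowest inverse document frequency, so phrases specific to a text carry more weight.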
Copyright © 2025