Mental health among college students is receiving growing attention, particularly depression, which significantly affects quality of life and academic achievement. This study develops a predictive model for depression in college students from psychosocial data using the Random Forest algorithm. The data are a public secondary dataset from Kaggle with 1,000 samples, covering demographic, lifestyle, and psychological variables. The analysis comprised data preprocessing, class balancing, model training, and evaluation using accuracy, precision, recall, F1-score, and the confusion matrix. On the test data, the Random Forest model predicted depression with 87.0% accuracy, 86.1% precision, 87.4% recall, and an 86.7% F1-score, indicating consistent performance across all four metrics. Word cloud visualization identified academic pressure, stress, and anxiety as the dominant factors. Compared with previous research using the Support Vector Machine (SVM) algorithm, Random Forest showed improved performance, particularly in handling complex and imbalanced data. These results support the effectiveness of a Random Forest-based machine learning approach for the early detection of depression in college students and provide a foundation for developing mental health monitoring systems in higher education settings.
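As an illustration of how the four reported metrics relate to a confusion matrix, the following minimal sketch computes accuracy, precision, recall, and F1-score for a binary classifier. The names `ConfusionMatrix` and `binary_metrics` and the example counts are hypothetical and chosen only for illustration; they are not the study's code or its actual confusion matrix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Binary confusion matrix counts, with 'depressed' as the positive class."""
    tp: int  # depressed, predicted depressed
    fp: int  # not depressed, predicted depressed
    fn: int  # depressed, predicted not depressed
    tn: int  # not depressed, predicted not depressed


def binary_metrics(cm: ConfusionMatrix) -> dict:
    """Return accuracy, precision, recall, and F1 as fractions in [0, 1]."""
    total = cm.tp + cm.fp + cm.fn + cm.tn
    predicted_pos = cm.tp + cm.fp
    actual_pos = cm.tp + cm.fn

    # Guard each ratio against an empty denominator.
    accuracy = (cm.tp + cm.tn) / total if total else 0.0
    precision = cm.tp / predicted_pos if predicted_pos else 0.0
    recall = cm.tp / actual_pos if actual_pos else 0.0

    # F1 is the harmonic mean of precision and recall.
    pr_sum = precision + recall
    f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical counts for a 100-sample test split (NOT the study's data).
    example = ConfusionMatrix(tp=40, fp=10, fn=5, tn=45)
    for name, value in binary_metrics(example).items():
        print(f"{name}: {value:.3f}")
```

Because F1 is the harmonic mean of precision and recall, it always lies between the two, which is consistent with the reported 86.7% F1-score falling between 86.1% precision and 87.4% recall.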
Copyright © 2026