Student stress in higher education is a multifaceted phenomenon shaped by psychological, academic, and environmental factors, with significant implications for students’ mental health and academic performance. Previous studies have examined stress determinants using traditional statistical approaches, but such methods often fail to capture complex, non-linear interactions among multiple stressors and offer limited insight into their relative importance. This study aims to identify and rank the key determinants of student stress using regression-based machine learning models. A structured dataset comprising 1,100 student observations and 21 predictor variables was analyzed. Four regression models (Linear Regression, Ridge Regression, Gradient Boosting Regressor, and Random Forest Regressor) were evaluated using 5-fold cross-validation and a holdout test set. Model performance was assessed using the coefficient of determination (R²), root mean squared error (RMSE), and mean absolute error (MAE). The Random Forest Regressor performed best, achieving a test R² of 0.812, meaning it explained roughly 81% of the variance in student stress on unseen data and indicating strong predictive accuracy and generalization. Feature importance analysis using permutation importance and model-specific measures identified bullying as the most influential determinant of student stress, followed by extracurricular activities, self-esteem, and sleep quality. Environmental factors such as safety and basic needs also made notable contributions. The agreement between the two feature importance methods supports the robustness of the findings. This study contributes to the literature by providing an integrated and interpretable machine learning framework for identifying dominant stress determinants, offering insights to support data-driven mental health interventions and policy development in higher education.
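The three evaluation metrics and the permutation importance procedure named above can be illustrated with a minimal, dependency-free sketch. This is not the study's code: the data are synthetic, the stand-in model is a fixed linear function rather than a trained Random Forest, and every name here (`predict`, `strong`, `weak`, `irrelevant`, the coefficients 3.0 and 1.0) is a hypothetical chosen for illustration. The sketch only shows what the quantities measure: permutation importance is taken as the mean drop in R² when a single feature column is shuffled.

```python
import math
import random


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean drop in R² per feature when that feature's column is shuffled."""
    rng = random.Random(seed)
    baseline = r2_score(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the rows with only column j replaced by shuffled values.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted = r2_score(y, [predict(row) for row in X_perm])
            drops.append(baseline - permuted)
        importances.append(sum(drops) / n_repeats)
    return importances


# Synthetic data (an assumption, not the study's dataset): three features,
# of which the target depends strongly on the first, weakly on the second,
# and not at all on the third.
feature_names = ["strong", "weak", "irrelevant"]
data_rng = random.Random(42)
X = [[data_rng.gauss(0, 1) for _ in feature_names] for _ in range(200)]
y = [3.0 * row[0] + 1.0 * row[1] + data_rng.gauss(0, 0.5) for row in X]


def predict(row):
    """Hypothetical stand-in for a fitted regressor."""
    return 3.0 * row[0] + 1.0 * row[1]


y_pred = [predict(row) for row in X]
baseline_r2 = r2_score(y, y_pred)
baseline_rmse = rmse(y, y_pred)
baseline_mae = mae(y, y_pred)
importances = permutation_importance(predict, X, y)

print(f"R2={baseline_r2:.3f} RMSE={baseline_rmse:.3f} MAE={baseline_mae:.3f}")
for name, score in sorted(zip(feature_names, importances), key=lambda p: -p[1]):
    print(f"{name}: {score:.3f}")
```

Ranking features by this score gives a model-agnostic importance ordering, which is why it can be compared against model-specific measures such as a tree ensemble's built-in importances. In practice a library implementation applied to the held-out test set would be the more usual choice than a hand-written loop.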
Copyright © 2026