Depression is a growing global health concern, particularly among adolescents and university students. Despite the availability of standardized assessments, delayed detection remains a major barrier to effective treatment. Digital behavioral data holds considerable potential for mental health assessment, but its use remains limited by the absence of integrated and interpretable computational models. This study presents an interpretable machine learning framework for classifying depression risk using multi-domain behavioral features extracted from simulated digital life datasets. Three public datasets were integrated and mapped to five psychological clusters based on DSM-5 criteria: self-regulation, negative affect, cognitive strain, comparison and avoidance, and sleep disturbance. Two ensemble classifiers, Random Forest and XGBoost, were trained and evaluated using 10-fold stratified cross-validation. Depression risk was categorized into three levels: Low, Medium, and High. The Random Forest model achieved the higher accuracy (81%) and macro-averaged F1-score (0.81) of the two, performing especially well in identifying transitional Medium-risk users. To enhance transparency, both global and local model interpretations were produced using SHapley Additive exPlanations (SHAP). Results revealed that digital stressors such as excessive screen time and disrupted sleep patterns were prominent in high-risk classifications, while mood stability and mindfulness acted as protective factors in low-risk groups. The proposed framework offers a scalable and explainable approach to early depression screening by integrating psychological theory with artificial intelligence methods. The findings contribute to the field of behavioral informatics by demonstrating the practical value of interpretable models in enhancing the reliability, transparency, and applicability of digital mental health systems and personalized behavioral monitoring.
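To make the evaluation terms above concrete, the following is a minimal sketch of how a macro-averaged F1-score over the three risk levels and a stratified fold assignment could be computed. It is not the study's code: the function names (`per_class_f1`, `macro_f1`, `stratified_folds`) and the toy labels are hypothetical, and the round-robin fold assignment is a simplified stand-in for a library routine such as scikit-learn's `StratifiedKFold`.

```python
from collections import Counter

# The three risk levels named in the abstract.
LABELS = ("Low", "Medium", "High")


def per_class_f1(y_true, y_pred, label):
    """One-vs-rest F1 for a single class: 2*TP / (2*TP + FP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of per-class F1, so every class counts equally."""
    return sum(per_class_f1(y_true, y_pred, c) for c in labels) / len(labels)


def stratified_folds(y, k):
    """Assign sample indices to k folds, round-robin within each class.

    Simplified illustration (no shuffling): each fold receives roughly the
    same share of every class as the full label list.
    """
    folds = [[] for _ in range(k)]
    seen = Counter()
    for i, label in enumerate(y):
        folds[seen[label] % k].append(i)
        seen[label] += 1
    return folds


# Toy labels invented for illustration only (not data from the study).
toy_true = ["Low", "Low", "Medium", "Medium", "High", "High"]
toy_pred = ["Low", "Medium", "Medium", "Medium", "High", "Low"]
print(round(macro_f1(toy_true, toy_pred), 3))
```

On these toy labels the per-class F1 values are 0.5 (Low), 0.8 (Medium), and about 0.667 (High), so the macro average is about 0.656. Because the macro average weights each class equally, a model cannot score well by favoring only the most common risk level.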
Copyrights © 2026