This study compares the performance of several algorithms in classifying student stress levels. The data come from the Student Lifestyle Dataset in the Kaggle repository, which contains 2,000 records with eight student lifestyle features. Three classification algorithms, Logistic Regression, K-Nearest Neighbors, and Support Vector Machine (SVM), were evaluated across four experimental scenarios: a baseline model, class balancing with the Synthetic Minority Oversampling Technique (SMOTE), feature selection with Recursive Feature Elimination (RFE), and hyperparameter tuning. Results were evaluated using accuracy, precision, recall, and F1-score. In addition, the best-performing model was interpreted using SHapley Additive exPlanations (SHAP). The findings indicate that combining data balancing, feature selection, and parameter optimization with the SVM algorithm significantly improved performance, achieving an accuracy of 0.98, a precision of 0.96, a recall of 0.98, an F1-score of 0.97, and a computation time of 0.011 seconds. The interpretability analysis showed that lifestyle factors such as study duration and sleep duration had the strongest influence on stress levels. These results demonstrate that the integrated optimization strategy supports fast and accurate detection of student stress levels.
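To make the four evaluation metrics concrete, the following is a minimal sketch of how accuracy, precision, recall, and F1-score can be computed for a multi-class stress classifier. It is an illustration only, not the study's evaluation code: the function name `macro_metrics`, the class labels, and the toy predictions are hypothetical, and macro averaging is an assumption, since the abstract does not state which averaging scheme was used.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Macro averaging (an assumption here) computes each metric per class
    and then takes the unweighted mean over classes.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()

    # Count true positives, false positives, and false negatives per class.
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1

    precisions, recalls, f1s = [], [], []
    for label in labels:
        predicted = tp[label] + fp[label]  # samples predicted as this class
        actual = tp[label] + fn[label]     # samples truly in this class
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n_classes = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n_classes,
        "recall": sum(recalls) / n_classes,
        "f1": sum(f1s) / n_classes,
    }


# Hypothetical labels and predictions, used only to exercise the function.
y_true = ["Low", "Low", "Moderate", "High", "High", "High"]
y_pred = ["Low", "Moderate", "Moderate", "High", "High", "Low"]
report = macro_metrics(y_true, y_pred)
print(report)
```

With imbalanced classes, accuracy alone can look high while minority-class recall stays low, which is one reason precision, recall, and F1-score are reported alongside it.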
Copyright © 2026