Student retention is a critical challenge for higher education institutions, including Ibnu Sina University (UIS), where a significant proportion of students are at risk of not completing their studies. Purpose: This study develops and compares predictive models based on the Random Forest (RF) and Support Vector Machine (SVM) algorithms to classify student retention into three categories: Active, At-Risk, and Inactive. Methods: Administrative data from 2,389 students across 6 study programs (2021/2022–2023/2024 cohorts) were used, covering 18 predictor variables that include academic performance (GPA, failed credits), demographic factors, and socio-economic factors. Class imbalance was handled with the Synthetic Minority Over-sampling Technique (SMOTE), and hyperparameters were optimized via grid search with 5-fold cross-validation. Results: RF outperformed SVM across all metrics, achieving an accuracy of 92.24%, a weighted F1-Score of 92.38%, and a macro F1-Score of 82.67%, compared with SVM's 87.63% accuracy and 87.79% weighted F1-Score. Feature importance analysis identified Total Failed Credits (0.2847) and Cumulative GPA (0.2134) as the strongest predictors. Novelty: Unlike prior studies that focus solely on academic data, this research integrates non-academic variables (leave history, parental income) and explicitly addresses class imbalance via SMOTE in a multi-class Indonesian higher education context, providing a practical Early Warning System (EWS) framework.
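The abstract reports both a weighted and a macro F1-Score for RF, and the two differ by roughly ten percentage points. The sketch below illustrates how such a gap can arise in a three-class problem with unequal class sizes: weighted F1 averages the per-class F1 values in proportion to class support, whereas macro F1 gives each class equal weight, so weaker performance on a small class pulls the macro value down more. The confusion matrix, the `CONFUSION` and `LABELS` names, and all helper functions are hypothetical and chosen only for illustration; the counts are not taken from the study.

```python
# Hypothetical 3-class confusion matrix (rows = true class, columns = predicted).
# The counts are illustrative assumptions and are NOT results from the study.
LABELS = ["Active", "At-Risk", "Inactive"]
CONFUSION = [
    [380, 15, 5],   # true Active
    [12, 40, 8],    # true At-Risk
    [3, 5, 32],     # true Inactive
]


def per_class_f1(cm):
    """Return the F1 score of each class, computed one-vs-rest."""
    n = len(cm)
    scores = []
    for k in range(n):
        tp = cm[k][k]
        fp = sum(cm[i][k] for i in range(n)) - tp  # predicted k, true class differs
        fn = sum(cm[k]) - tp                       # true k, predicted class differs
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return scores


def macro_f1(cm):
    """Unweighted mean of per-class F1: every class counts equally."""
    scores = per_class_f1(cm)
    return sum(scores) / len(scores)


def weighted_f1(cm):
    """Mean of per-class F1 weighted by class support (number of true samples)."""
    scores = per_class_f1(cm)
    support = [sum(row) for row in cm]
    return sum(f * s for f, s in zip(scores, support)) / sum(support)


def accuracy(cm):
    """Fraction of samples on the diagonal of the confusion matrix."""
    correct = sum(cm[k][k] for k in range(len(cm)))
    return correct / sum(sum(row) for row in cm)


if __name__ == "__main__":
    for label, score in zip(LABELS, per_class_f1(CONFUSION)):
        print(f"{label:>8}: F1 = {score:.4f}")
    print(f"accuracy    = {accuracy(CONFUSION):.4f}")
    print(f"weighted F1 = {weighted_f1(CONFUSION):.4f}")
    print(f"macro F1    = {macro_f1(CONFUSION):.4f}")
```

In this toy matrix the majority class dominates the weighted average, so the weighted F1 stays close to the accuracy while the macro F1 is noticeably lower. For an early warning setting, where the smaller At-Risk and Inactive groups are the ones of interest, the macro value is arguably the more informative of the two.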
Copyrights © 2026