This study aims to develop a predictive model for identifying students at risk of academic dropout using the Decision Tree and Linear Regression algorithms. The data come from the public Kaggle dataset Students Dropout and Academic Success, which includes demographic, socioeconomic, and per-semester academic performance variables. Preprocessing consists of data cleaning, label encoding of categorical variables, normalization of numeric features, and reduction of the target to a binary classification between Dropout and Graduate. The two algorithms are compared on accuracy, precision, and recall. The results show that the Decision Tree outperforms Linear Regression at capturing non-linear patterns in student data. Feature importance analysis shows that the number of curricular units in the second semester and tuition payment status are the main predictors of dropout risk. These findings are expected to help educational institutions implement early interventions that improve student academic success.
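The target adjustment and evaluation steps described above can be illustrated with a minimal, dependency-free sketch. It is not the study's implementation. It assumes that the raw target also contains a third class (here called `Enrolled`) whose rows are discarded, that Dropout is treated as the positive class, and that continuous Linear Regression outputs are converted to labels with a 0.5 cutoff. The function names and the toy values are hypothetical.

```python
from typing import Dict, List, Sequence


def to_binary_target(labels: Sequence[str]) -> List[int]:
    """Keep only Dropout/Graduate rows and map them to 1/0.

    Assumption: any other label (e.g. "Enrolled") is discarded. In a real
    pipeline the matching feature rows must be dropped with the same filter.
    """
    mapping = {"Dropout": 1, "Graduate": 0}
    return [mapping[label] for label in labels if label in mapping]


def threshold_scores(scores: Sequence[float], cutoff: float = 0.5) -> List[int]:
    """Turn continuous regression outputs into class labels.

    Assumption: the 0.5 cutoff is illustrative; the source does not state one.
    """
    return [1 if score >= cutoff else 0 for score in scores]


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, and recall with 1 (Dropout) as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
        # Guard against division by zero when no positives are predicted/present.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


if __name__ == "__main__":
    # Toy values for illustration only; these are not results from the study.
    raw_labels = ["Dropout", "Graduate", "Enrolled", "Dropout", "Graduate", "Graduate"]
    y_true = to_binary_target(raw_labels)
    y_pred = threshold_scores([0.9, 0.2, 0.4, 0.6, 0.1])
    print(binary_metrics(y_true, y_pred))
```

Reporting recall alongside accuracy matters in this setting because a missed at-risk student (a false negative) is usually the costlier error for early intervention.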
Copyrights © 2026