Predicting student academic performance at an early stage is crucial for educational institutions that want to intervene in time. This research applies ensemble learning methods to predict the final grades (G3) of secondary school students in the public UCI "Student Performance" dataset and evaluates their effectiveness. To prevent data leakage, the models were trained without the earlier period grades (G1 and G2), so that the system functions strictly as an Early Warning System. To support model robustness and validity, the training procedure combined k-fold cross-validation, hyperparameter optimization, and a direct comparison against a baseline model (Linear Regression). Evaluation results indicate that the XGBoost model achieved the highest performance, with an R-squared ($R^2$) of 0.28. Feature importance analysis further showed that accumulated absences and prior class failures are the most important predictors. As a practical implication, these findings suggest that schools should develop proactive early warning dashboards and improve the overall school climate to address the root causes of absenteeism at an early stage.
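The evaluation protocol summarized above (k-fold cross-validation scored by $R^2$) can be illustrated with a minimal, standard-library-only sketch. It is not the authors' pipeline: the data are synthetic, a one-feature least-squares fit stands in for the Linear Regression baseline, and XGBoost and the hyperparameter search are omitted. All function names, the sample size, and the coefficients used to generate the data are assumptions made for illustration.

```python
import random
from statistics import mean


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and split them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_simple_linear(x, y):
    """Ordinary least squares with a single feature (illustrative baseline)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return lambda xi: intercept + slope * xi


def cross_val_r2(x, y, k=5, seed=0):
    """Fit on k-1 folds, score R^2 on the held-out fold, repeat for each fold."""
    scores = []
    for held_out in k_fold_indices(len(x), k, seed):
        held = set(held_out)
        train = [i for i in range(len(x)) if i not in held]
        model = fit_simple_linear([x[i] for i in train], [y[i] for i in train])
        y_test = [y[i] for i in held_out]
        y_hat = [model(x[i]) for i in held_out]
        scores.append(r2_score(y_test, y_hat))
    return scores


# Synthetic stand-in data (NOT the UCI dataset): grades on a 0-20 scale
# that decline with absences, plus Gaussian noise.
rng = random.Random(42)
absences = [rng.randint(0, 30) for _ in range(200)]
g3 = [max(0.0, min(20.0, 12 - 0.15 * a + rng.gauss(0, 3))) for a in absences]

scores = cross_val_r2(absences, g3, k=5)
print("per-fold R^2:", [round(s, 3) for s in scores])
print("mean R^2:", round(mean(scores), 3))
```

Reporting the mean of the per-fold scores, rather than a single train/test split, is what makes a comparison between a baseline and a more flexible model less sensitive to one lucky or unlucky partition.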
Copyrights © 2026