The secondary school period is a crucial time for the development of students' academic and social performance. Educational data mining (EDM) has emerged as a strategic method for exploring patterns in educational data and predicting academic performance from various factors, including students' personalities. However, class imbalance in educational data remains a problem that can bias predictive models. This study aims to identify which academic, demographic, and Big Five personality factors contribute to junior high school students' academic performance in mathematics. The Random Forest method and the SMOTE oversampling technique are used to identify the contributing factors and to improve the performance of the predictive model. The results indicate that academic factors are significant, while socio-economic and personality factors are less significant in relation to academic performance. In addition, SMOTE proves effective in addressing data imbalance, and the Random Forest model performs best with appropriate hyperparameter tuning. The combination of Random Forest, hyperparameter tuning with GridSearchCV, and SMOTE yields a model with an accuracy of 99%.
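To illustrate the oversampling step named above, the following is a minimal sketch of the interpolation idea behind SMOTE, written with the Python standard library only. It is not the study's pipeline and not the imbalanced-learn implementation; the function name `smote_oversample`, the parameter `k`, and the toy feature values are all hypothetical and chosen only for illustration.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of base, excluding base itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical toy data: (previous math score, weekly study hours)
majority = [(80.0, 6.0), (85.0, 7.0), (78.0, 5.0),
            (90.0, 8.0), (88.0, 7.5), (82.0, 6.5)]
minority = [(45.0, 1.0), (50.0, 2.0), (48.0, 1.5)]

# Add synthetic minority points until both classes have the same size
new_points = smote_oversample(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
```

Each synthetic point lies on the line segment between two existing minority samples, so the minority class grows without exact duplication. In practice, oversampling of this kind is typically applied only to the training split, so that synthetic points do not leak into the evaluation data.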
Copyrights © 2025