Introduction: Accurate classification of student grades supports early intervention and performance assessment. This study compares Random Forest and Neural Network models for classifying student grades on a dataset of 2,392 high school students. The aim is to evaluate both models' predictive performance and interpretability in an educational data mining context.

Methods: The dataset, which contains academic and demographic features, was pre-processed by handling missing values, encoding categorical variables, and scaling numerical features. Grades were categorized into five classes: A, B, C, D, and F. Both models were implemented in Python and evaluated using accuracy, precision, recall, and F1-score. Hyperparameters were tuned via Grid Search with cross-validation.

Results: The Random Forest model achieved a baseline accuracy of 70.2%, slightly ahead of the Neural Network at 69.1%. After tuning, Random Forest improved to 71.45% accuracy, while the Neural Network reached 70.49%. Both models showed strong precision and recall in identifying failing students (class F), with F1-scores of 0.90 (Random Forest) and 0.89 (Neural Network). However, classification of the non-failing grades (A to D) remained challenging due to overlap between these classes. Feature importance analysis gave the Random Forest model an interpretability advantage.

Conclusions: Both models are effective for grade classification, with Random Forest roughly one percentage point more accurate and easier to interpret. The Neural Network, while slightly less accurate, captures nonlinear relationships effectively after tuning. The results suggest that model selection should be guided by context-specific needs, balancing performance with transparency. Future work may include ensemble techniques and expanded feature sets to improve classification robustness.
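The workflow summarized under Methods (imputation, encoding, scaling, then Grid Search with cross-validation over both model families) could be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the study's actual code: the abstract does not name the library, the features, the grade cut points, or the parameter grids, so the column names (`study_hours`, `absences`, `parent_support`), the synthetic data generator, the thresholds, and the grids below are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.metrics import classification_report
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

NUMERIC = ["study_hours", "absences"]   # hypothetical feature names
CATEGORICAL = ["parent_support"]        # hypothetical feature name


def make_synthetic_students(n=400, seed=0):
    """Generate stand-in data for the example; NOT the study's dataset."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "study_hours": rng.uniform(0, 20, n),
        "absences": rng.integers(0, 30, n).astype(float),
        "parent_support": rng.choice(["low", "medium", "high"], n),
    })
    score = (2.0 + 0.08 * df["study_hours"] - 0.07 * df["absences"]
             + rng.normal(0, 0.4, n))
    # Illustrative cut points only; the abstract does not state the real ones.
    grades = pd.cut(score, bins=[-np.inf, 1.0, 1.7, 2.3, 2.9, np.inf],
                    labels=list("FDCBA")).astype(str)
    # Inject a few missing values so the imputation step has work to do.
    df.loc[rng.choice(n, 20, replace=False), "study_hours"] = np.nan
    return df, grades


def build_pipeline(model):
    """Impute, encode, and scale inside a pipeline so CV folds do not leak."""
    numeric = Pipeline([("impute", SimpleImputer(strategy="median")),
                        ("scale", StandardScaler())])
    categorical = Pipeline([("impute", SimpleImputer(strategy="most_frequent")),
                            ("encode", OneHotEncoder(handle_unknown="ignore"))])
    prep = ColumnTransformer([("num", numeric, NUMERIC),
                              ("cat", categorical, CATEGORICAL)])
    return Pipeline([("prep", prep), ("model", model)])


def tune_and_evaluate(X, y, seed=0):
    """Grid-search both model families and score them on a held-out split."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed)
    # Small assumed grids; the study's actual search space is not given.
    candidates = {
        "random_forest": (
            RandomForestClassifier(random_state=seed),
            {"model__n_estimators": [50, 100], "model__max_depth": [None, 5]},
        ),
        "neural_network": (
            MLPClassifier(max_iter=500, random_state=seed),
            {"model__hidden_layer_sizes": [(16,), (32,)],
             "model__alpha": [1e-4, 1e-2]},
        ),
    }
    results = {}
    for name, (model, grid) in candidates.items():
        search = GridSearchCV(build_pipeline(model), grid, cv=3,
                              scoring="accuracy")
        search.fit(X_tr, y_tr)
        report = classification_report(y_te, search.predict(X_te),
                                       output_dict=True, zero_division=0)
        results[name] = {"best_params": search.best_params_,
                         "accuracy": report["accuracy"],
                         "per_class": {g: report[g] for g in "ABCDF"
                                       if g in report}}
    return results
```

Calling `tune_and_evaluate(*make_synthetic_students())` returns, for each model, the selected hyperparameters, the held-out accuracy, and per-class precision, recall, and F1. Keeping the preprocessing inside the pipeline means the imputer and scaler are refit on each training fold, which avoids leaking validation data into the tuning step. Scores on this synthetic data say nothing about the accuracies reported in the abstract.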
Copyright © 2025