JTAM (Jurnal Teori dan Aplikasi Matematika)
Vol 9, No 2 (2025): April

Identification of Demographic Factors Affecting Student Performance using Tree-Based Machine Learning Models

Murwaningtyas, Chatarina Enny



Article Info

Publish Date
26 Apr 2025

Abstract

This study aims to identify the key academic and demographic factors that influence student performance in the Logic and Set Theory course, particularly across the different learning modes used during and after the COVID-19 pandemic. It adopts a quantitative exploratory design involving students from the 2020 to 2023 cohorts at Sanata Dharma University. Academic data (exam and assignment scores, course outcomes) and demographic data (e.g., parental education and income, region of origin, gender, and high school major) were collected from the academic system and supplemented with questionnaires. The dataset was cleaned, encoded, and scaled using RobustScaler, and class imbalance was addressed with SMOTE. Descriptive statistics were used to explore initial data characteristics. Five tree-based machine learning models (Decision Tree, Random Forest, XGBoost, LightGBM, and CatBoost) were implemented within a pipeline that combined preprocessing with hyperparameter tuning using GridSearchCV and 5-fold cross-validation. Models were evaluated on accuracy, precision, recall, F1-score, AUC, and Average Precision. XGBoost and CatBoost achieved the best performance (accuracy 92%, AUC 0.99), with balanced precision and recall across all four performance categories. Feature importance analysis indicated that exam and assignment scores were the strongest predictors, while demographic factors such as enrollment year, parental education, and income contributed moderately. Gender, region, and high school major had minimal influence. This research demonstrates how machine learning can integrate academic and demographic data, rather than analyzing them in isolation, to uncover nuanced patterns in student achievement. The findings support the development of data-driven educational interventions, such as preparatory learning modules, peer mentoring for underperforming groups, targeted academic advising for students from low-income or less-educated families, and flexible instructional strategies for cohorts affected by pandemic-related disruptions.
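The two preprocessing steps named in the abstract, robust scaling and SMOTE oversampling, can be illustrated with a minimal plain-Python sketch. It is not the authors' code: the study reports using RobustScaler and SMOTE, which are available in scikit-learn and imbalanced-learn respectively. The function names `robust_scale` and `smote_like`, the neighbour count `k`, and the toy numbers are assumptions made only for illustration.

```python
import math
import random
import statistics


def robust_scale(column):
    """Center a numeric column by its median and divide by its IQR.

    This mirrors the idea behind robust scaling: the median and the
    interquartile range are less sensitive to outliers than mean and
    standard deviation.
    """
    q1, median, q3 = statistics.quantiles(column, n=4, method="inclusive")
    iqr = (q3 - q1) or 1.0  # fall back to 1.0 for a constant column
    return [(x - median) / iqr for x in column]


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority rows by SMOTE-style interpolation.

    Each synthetic row lies on the line segment between a randomly chosen
    minority row and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (row for row in minority if row is not base),
            key=lambda row: math.dist(base, row),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Hypothetical exam scores (not data from the study).
scaled = robust_scale([50, 60, 70, 80, 100])  # [-1.0, -0.5, 0.0, 0.5, 1.5]

# Hypothetical minority-class rows with two already-scaled features.
minority_rows = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
extra_rows = smote_like(minority_rows, n_new=5, k=2, seed=1)
```

In a cross-validated workflow such as the one the abstract describes, oversampling is normally applied only to the training folds so that synthetic rows never leak into the validation data.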

Copyright © 2025






Journal Info

Abbrev

jtam

Publisher

Subject

Mathematics

Description

Jurnal Teori dan Aplikasi Matematika (JTAM) is managed by the Mathematics Education Study Program, FKIP, Universitas Muhammadiyah Mataram, with ISSN (Print) 2597-7512 and ISSN (Online) 2614-1175. The editorial team accepts research results, ideas, and studies on (1) the development of methods or models ...