Journal of Applied Data Sciences
Vol 5, No 3: SEPTEMBER 2024

Machine Learning Algorithm Optimization using Stacking Technique for Graduation Prediction

Herianto, Herianto
Kurniawan, Bambang
Hartomi, Zupri Henra
Irawan, Yuda
Anam, M Khairul



Article Info

Publish Date
14 Aug 2024

Abstract

Graduating on time is crucial to academic success, affecting study time, costs, and education quality. Hang Tuah University Pekanbaru (UHTP) is currently struggling to meet its goal of a 75% on-time graduation rate. This study addresses the issue with an ensemble learning approach, Stacking Machine Learning Optuna SMOTE (SMLOS). Our primary objective is to improve classification accuracy so that student graduation timelines can be predicted effectively. We employ K-Nearest Neighbor (KNN), Support Vector Machine (SVM), Decision Tree (C4.5), Random Forest (RF), and Naive Bayes (NB). These are combined with meta-models, including Logistic Regression (LR), AdaBoost, XGBoost, LR+AdaBoost, and LR+XGBoost, to create a robust prediction model. To address class imbalance, we apply the Synthetic Minority Over-sampling Technique (SMOTE), and we use Optuna for hyperparameter tuning. The findings show that SMLOS with the AdaBoost meta-model achieved the highest accuracy, 95.50%, surpassing previous models, whose accuracy averaged around 85%. This result demonstrates the effectiveness of SMOTE for handling class imbalance and of Optuna for hyperparameter optimization. Integrating this model into UHTP's academic information system facilitates real-time monitoring and analysis of student data, offering a novel solution for promoting a Smart Campus through more accurate student performance predictions. The technique is useful not only for predicting student graduation but can also be applied to other machine learning tasks to improve classification accuracy and stability.
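
The stacking arrangement described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scikit-learn, uses synthetic data in place of the UHTP student records, approximates C4.5 with an entropy-based decision tree (scikit-learn implements CART, not C4.5), and leaves out the SMOTE resampling and Optuna tuning steps. All parameter values are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic, mildly imbalanced stand-in for student records (NOT the UHTP data).
X, y = make_classification(
    n_samples=600,
    n_features=10,
    n_informative=6,
    weights=[0.7, 0.3],
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Base learners named in the abstract; hyperparameters here are assumptions.
base_learners = [
    ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))),
    ("svm", make_pipeline(StandardScaler(), SVC(random_state=0))),
    # Entropy criterion as a rough stand-in for C4.5.
    ("tree", DecisionTreeClassifier(criterion="entropy", random_state=0)),
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("nb", GaussianNB()),
]

# AdaBoost as the meta-model, trained on cross-validated base-learner outputs.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=AdaBoostClassifier(random_state=0),
    cv=5,
)
stack.fit(X_train, y_train)

accuracy = stack.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.3f}")
```

The meta-model sees only the out-of-fold predictions of the five base learners, which is what distinguishes stacking from simple voting. Accuracy on this synthetic data says nothing about the 95.50% reported in the abstract.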

Copyrights © 2024






Journal Info

Abbrev

JADS

Publisher

Subject

Computer Science & IT; Control & Systems Engineering; Decision Sciences, Operations Research & Management

Description

One of the current hot topics in science is data: how can datasets be used in scientific and scholarly research in a more reliable, citable and accountable way? Data is of paramount importance to scientific progress, yet most research data remains private. Enhancing the transparency of the processes ...