Scientific Journal of Informatics
Vol. 11 No. 3: August 2024

Analysis of Student Graduation Prediction Using Machine Learning Techniques on an Imbalanced Dataset: An Approach to Address Class Imbalance

Hermanto, Dedy
Desy Iba Ricoida
Desi Pibriana
Rusbandi
Muhammad Rizky Pribadi



Article Info

Publish Date
05 Aug 2024

Abstract

Purpose: Machine learning is a key area of artificial intelligence with applications in many fields, including the prediction of timely graduation. Supervised learning is one family of machine learning methods, but its results depend on the distribution of the data. With imbalanced classes, where the minority class is significantly smaller than the majority class, classification performance suffers. Timely graduation matters to a university's sustainability and accreditation. This research aims to identify a suitable method for predicting timely graduation while managing class imbalance with SMOTE (Synthetic Minority Oversampling Technique).

Methods: This study uses a five-year dataset of 1328 records with 26 attributes, including status labels. After preprocessing, five classification algorithms are applied: Decision Tree (DT), Naive Bayes (NB), Logistic Regression (LR), K-Nearest Neighbors (KNN), and Random Forest (RF). Each algorithm is run both with and without SMOTE to handle the class imbalance. In the dataset, 60.84% of the cases are timely graduations. To mitigate the imbalance, over/under-sampling methods are employed to balance the data. Classification performance is assessed using the confusion matrix.

Results: Without SMOTE, the accuracies were 89.12% for DT, 79.65% for NB, 89.47% for LR, 87.72% for KNN, and 90.88% for RF. With SMOTE, the accuracies were 88.89% for DT, 81.48% for NB, 91.05% for LR, 92.59% for KNN, and 89.81% for RF. NB, LR, and KNN improved with SMOTE, while DT and RF decreased slightly. KNN with SMOTE gave the best result overall.

Novelty: The study compares five algorithms side by side, each with and without SMOTE, and the comparison shows that SMOTE improves the classification performance of several, but not all, of the algorithms compared.
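To make the oversampling step concrete, the following is a minimal sketch of the core SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its k nearest minority neighbours. It is an illustration written for this summary, not the authors' implementation. The function name `smote_sketch`, the toy feature values, and the choice of k are all assumptions, and the paper's actual 26 attributes are not reproduced here.

```python
import math
import random


def smote_sketch(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples.

    Illustrative only: a hypothetical helper, not the paper's code.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        base_idx = rng.randrange(len(minority))
        base = minority[base_idx]
        # The k nearest minority neighbours of the base sample (itself excluded).
        others = [p for i, p in enumerate(minority) if i != base_idx]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # Random position along the segment from base to neighbour.
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical toy data with two numeric features per student record.
majority = [(3.4, 140.0), (3.6, 144.0), (3.1, 138.0),
            (3.8, 146.0), (3.5, 142.0), (3.3, 139.0)]
minority = [(2.1, 96.0), (2.4, 102.0), (2.0, 90.0)]

# Generate just enough synthetic samples to match the majority class size.
new_points = smote_sketch(minority, len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

In a setup like the one described, oversampling of this kind is usually applied to the training split only, so that the test set contains real records exclusively. Whether the study did so is not stated in the abstract.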

Copyrights © 2024






Journal Info

Abbrev

sji

Subject

Computer Science & IT; Control & Systems Engineering; Decision Sciences, Operations Research & Management; Electrical & Electronics Engineering; Engineering

Description

Scientific Journal of Informatics (p-ISSN 2407-7658 | e-ISSN 2460-0040) is published by the Department of Computer Science, Universitas Negeri Semarang. It is a scientific journal of Information Systems and Information Technology which includes scholarly writings on pure research and applied research in the ...