UNP Journal of Statistics and Data Science
Vol. 2 No. 3 (2024): UNP Journal of Statistics and Data Science

Classification of Dropout Rates in West Sumatra Using the Random Forest Algorithm with Synthetic Minority Oversampling Technique

Anita Fadila
Syafriandi Syafriandi
Yenni Kurniawati
Admi Salma



Article Info

Publish Date
24 Aug 2024

Abstract

This study aims to classify school dropout rates in West Sumatra Province using the Random Forest algorithm with the Synthetic Minority Oversampling Technique (SMOTE). According to 2021 data from the Ministry of Education, Culture, Research, and Technology (Kemdikbudristek), the dropout rate in West Sumatra is above the national average, and efforts to reduce it have so far produced suboptimal results. The study therefore seeks to identify the factors behind student dropouts and to compare the performance of the Random Forest algorithm with and without SMOTE. The 2021 West Sumatra dropout data are strongly imbalanced between classes, so SMOTE is applied to balance them. The dataset is split into training and testing sets in an 80%:20% ratio, and parameter tuning is performed over mtry and the number of trees (ntree). Both models are evaluated with a confusion matrix, from which precision, recall, and F1-score are compared. The results show that Random Forest with SMOTE outperforms the version without SMOTE on all three metrics. Based on the Mean Decrease Gini value, the presence of the biological mother is identified as the most important factor influencing student dropouts. The study concludes that using SMOTE with the Random Forest algorithm helps reduce classification bias toward the majority class and improves the model's ability to detect students at risk of dropping out.
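
As an illustration of two steps named in the abstract, the following is a minimal Python sketch of SMOTE-style oversampling and of the precision, recall, and F1-score derived from a confusion matrix. It is not the authors' code: the function names `smote` and `binary_metrics`, the toy feature rows, and the toy labels are all hypothetical, and the Random Forest model itself (with its mtry and ntree tuning) is omitted. The sketch assumes numeric features and a binary target. A common practice is to oversample only the training split so that the test set keeps its original class balance; the abstract does not say which order the study used.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic rows, each interpolated between a random
    minority row and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority rows by Euclidean distance to the base row.
        others = [row for j, row in enumerate(minority) if j != i]
        others.sort(key=lambda row: math.dist(base, row))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def binary_metrics(y_true, y_pred, positive=1):
    """Confusion-matrix counts plus precision, recall, and F1 for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn,
            "precision": precision, "recall": recall, "f1": f1}


# Hypothetical toy data: four minority-class rows with two numeric features.
toy_minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.4]]
toy_synthetic = smote(toy_minority, n_new=6, k=2)

# Hypothetical true and predicted labels (1 = dropout, 0 = not dropout).
toy_scores = binary_metrics([1, 1, 1, 0, 0, 0, 0, 0],
                            [1, 1, 0, 1, 0, 0, 0, 0])
```

Because every synthetic row lies on a segment between two existing minority rows, oversampling fills in the minority region rather than duplicating records. Reporting precision and recall for the dropout class separately shows whether a model is actually finding at-risk students or only scoring well on the majority class.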

Copyrights © 2024






Journal Info

Abbrev

ujsds

Subject

Computer Science & IT; Decision Sciences, Operations Research & Management; Mathematics; Social Sciences

Description

UNP Journal of Statistics and Data Science (UJSDS) is an open-access e-journal launched in 2022 by the Department of Statistics, Faculty of Science and Mathematics, Universitas Negeri Padang. UJSDS publishes scientific articles on various aspects related to Statistics, Data Science, and its ...