Garuda - Garba Rujukan Digital

Vol 10 No 1 (2024): January

Pratama, Irfan (Unknown)
Prasetyaningrum, Putri Taqwa (Unknown)
Chandra, Albert Yakobus (Unknown)
Suria, Ozzi (Unknown)

Publish Date
25 Feb 2024

Imbalanced data refers to a condition in which the number of samples differs between one class and another. This gives rise to the terms “majority” class, the class with more instances in the dataset, and “minority” classes, the classes with fewer instances. Educational data mining demands accurate analysis of student performance, so it requires an appropriate dataset to produce good accuracy. This study measures the performance of resampling methods through classification on student performance datasets, which are also multi-class datasets; it therefore also measures how the methods perform on a multi-class classification problem. Using four public educational datasets, which consist of the results of an educational process, this study aims to get a better picture of which resampling methods suit that kind of dataset. The research uses more than twenty resampling methods from the SMOTE variants library. As a comparison, it implements nine classification methods to measure the performance of the resampled data against the non-resampled data. According to the results, SMOTE-ENN is generally the better resampling method, since it produces an F1 score of 0.97 under the Stacking classification method, the highest among the methods tested. However, the resampling methods perform relatively poorly on the dataset with wider label variation. Future work is to investigate why the resampling methods cannot handle large class variation, since the F1 score on the student dataset is lower than on the other datasets.
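
To make the idea of resampling a multi-class dataset concrete, here is a minimal sketch of the basic SMOTE interpolation step in plain Python. It is an illustration only, not the authors' pipeline: it does not include the ENN cleaning stage of SMOTE-ENN, it does not use the SMOTE variants library the study relies on, and the function name `smote_oversample`, its parameters, and the toy data are all hypothetical.

```python
import math
import random
from collections import Counter


def smote_oversample(X, y, k=3, seed=0):
    """Grow every smaller class to the majority size by interpolating
    between same-class neighbours (the basic SMOTE idea, assumed here)."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())  # size of the majority class
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        members = [x for x, lab in zip(X, y) if lab == label]
        if n == target or len(members) < 2:
            continue  # majority class, or too few points to interpolate
        for _ in range(target - n):
            base = rng.choice(members)
            # up to k nearest same-class neighbours, excluding the sample itself
            others = [m for m in members if m is not base]
            neighbours = sorted(others, key=lambda m: math.dist(base, m))[:k]
            mate = rng.choice(neighbours)
            gap = rng.random()  # position along the segment base -> mate
            X_out.append(tuple(b + gap * (m - b) for b, m in zip(base, mate)))
            y_out.append(label)
    return X_out, y_out


# Hypothetical toy data: three classes with 6 / 3 / 2 samples.
X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.1, 0.4), (0.4, 0.2),
     (2.0, 2.0), (2.2, 2.1), (2.1, 2.3),
     (4.0, 0.0), (4.2, 0.1)]
y = ["pass"] * 6 + ["merit"] * 3 + ["fail"] * 2

X_res, y_res = smote_oversample(X, y)
print(Counter(y), "->", Counter(y_res))
```

Each synthetic point lies on a line segment between two existing samples of the same class, so a class with very few or widely scattered samples gives the method little to interpolate from. That is one plausible reason, offered here only as an assumption, for weaker results on datasets with many labels.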

Journal Info

Publisher

Universitas Pesantren Tinggi Darul ulum

Subject

Computer Science & IT

Description

Register: Scientific Journals of Information System Technology is an international, peer-reviewed journal that publishes the latest research results in Information and Communication Technology (ICT). The journal covers a wide range of topics, including Enterprise Systems, Information Systems ...

Article Info

Abstract

Measuring Resampling Methods on Imbalanced Educational Dataset’s Classification Performance
