A Comparative Study of Multi-Class Classification Based on Imbalanced Data: A Review
Abdulkareem, Rojan; Abdulazeez, Adnan Mohsin
The Indonesian Journal of Computer Science, Vol. 14 No. 5 (2025)
Publisher: AI Society & STMIK Indonesia

DOI: 10.33022/ijcs.v14i5.5020

Abstract

Classifying imbalanced multi-class datasets remains a major challenge in machine learning across many application areas, including medical diagnostics, fraud detection, and image classification, where the minority classes are often the most important yet the most under-represented. Classical classification algorithms designed for balanced data tend to be biased toward the majority classes, so many minority-class samples are misclassified and overall model performance suffers. This review covers the main state-of-the-art techniques for the class imbalance problem, including under-sampling and over-sampling, ensemble approaches, cost-sensitive learning, and synthetic data generation via SMOTE (Synthetic Minority Over-sampling Technique). More recently, GANs (Generative Adversarial Networks) have also been used to generate synthetic data, which is especially valuable for complex datasets that need realistic data augmentation. Each technique is analyzed in terms of its ability to handle imbalanced data, using conventional metrics such as accuracy as well as metrics suited to imbalanced datasets, such as F1-score and G-mean. Recent advances, such as hybrid approaches and deep learning-based methods, are also discussed as viable solutions to the complexities of big data (large and high-dimensional) and the models built on it. This comparative analysis is intended to support the construction of more robust models for complex data in modern applications.
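
The contrast the abstract draws between accuracy and imbalance-aware metrics can be illustrated with a small sketch. The code below is not taken from the reviewed paper: the toy labels, the function names, and the choice of macro-averaged F1 and of G-mean as the geometric mean of per-class recalls are assumptions made for illustration (other multi-class definitions of both metrics exist). It shows how a classifier that always predicts the majority class can score high accuracy while both imbalance-aware metrics collapse.

```python
import math
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_precision_recall(y_true, y_pred):
    """Return (precision, recall) dicts keyed by each class present in y_true."""
    true_count = Counter(y_true)
    pred_count = Counter(y_pred)
    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    precision, recall = {}, {}
    for c in true_count:
        # A class that is never predicted gets precision 0 by convention here.
        precision[c] = true_pos[c] / pred_count[c] if pred_count[c] else 0.0
        recall[c] = true_pos[c] / true_count[c]
    return precision, recall


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so every class counts equally."""
    precision, recall = per_class_precision_recall(y_true, y_pred)
    scores = []
    for c in recall:
        denom = precision[c] + recall[c]
        scores.append(2 * precision[c] * recall[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


def g_mean(y_true, y_pred):
    """Geometric mean of per-class recalls (one common multi-class G-mean)."""
    _, recall = per_class_precision_recall(y_true, y_pred)
    return math.prod(recall.values()) ** (1 / len(recall))


# Hypothetical toy data: 90 majority samples, 8 and 2 minority samples.
y_true = ["a"] * 90 + ["b"] * 8 + ["c"] * 2
# A degenerate classifier that always predicts the majority class.
y_pred = ["a"] * 100

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
gm = g_mean(y_true, y_pred)
print(f"accuracy={acc:.3f} macro_f1={f1:.3f} g_mean={gm:.3f}")
```

On this toy data, accuracy is 0.9 even though no minority sample is ever recognized, while macro F1 falls to roughly 0.32 and G-mean to 0, since a single class with zero recall drives the geometric mean to zero.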