Garuda - Garba Rujukan Digital

P-Index (2021 - 2026): 0.23

This author has published in the following journal:

BAREKENG: Jurnal Ilmu Matematika dan Terapan

Indahwati Indahwati

Department of Statistics and Data Science, IPB University, Indonesia

Author-ID : 9849308

Computer Science & IT; Control & Systems Engineering; Economics, Econometrics & Finance; Energy; Engineering; Mathematics; Mechanical Engineering; Physics; Transportation

Published: 1 document

Articles

PERFORMANCE ANALYSIS OF MODIFIED-ODBOT AND SMOTE FOR TREE-BASED CLASSIFICATION OF IMBALANCED HUMAN DEVELOPMENT INDEX DATA
Yunna Mentari Indah; Anwar Fitrianto; Indahwati Indahwati
BAREKENG: Jurnal Ilmu Matematika dan Terapan Vol 20 No 3 (2026): BAREKENG: Journal of Mathematics and Its Application
Publisher : PATTIMURA UNIVERSITY

DOI: 10.30598/barekengvol20iss3pp2311-2326

Classifying Human Development Index (HDI) data is difficult because of severe class imbalance: low-development regions are substantially underrepresented. Machine learning models trained on such data tend to be biased toward the majority classes, which makes minority classes hard to identify accurately. This study proposes a modified ODBOT that replaces Euclidean distance with Mahalanobis distance in the oversampling mechanism (Mahalanobis-based ODBOT) and compares it with Euclidean-based ODBOT, with and without Principal Component Analysis (PCA), and with the conventional SMOTE technique. Four tree-based classifiers were used: Random Forest, Double Random Forest, XGBoost, and LightGBM. The HDI data set from the Central Statistics Agency, consisting of 514 observations and four features with an imbalance ratio (IR) of 19.0, was split into training and testing sets at an 80:20 ratio over 30 repetitions. Performance was evaluated using F1-Measure (F1-M), Geometric Mean (G-M), Area Under the Curve (AUC), and computation time. The results show that Mahalanobis-based ODBOT achieved the highest AUC across all four classification models and the highest G-M in three of the four, but required significantly longer computation time (2545.66 seconds). In contrast, Euclidean-based ODBOT with PCA improved F1-M while reducing computation time to 7.21 seconds, compared with 68.23 seconds for the original ODBOT, and SMOTE consistently improved G-M and AUC across all experiments. These findings suggest that oversampling techniques should be selected based on practical application needs. Specifically, Mahalanobis-based ODBOT can be recommended when prediction performance is the priority, while Euclidean-based ODBOT with PCA, or SMOTE, is preferable for real-world implementations that require faster execution and lower computational cost.
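
The abstract's central change, swapping Euclidean distance for Mahalanobis distance, can be illustrated with a minimal sketch. This is not the authors' ODBOT implementation. The function names, the toy covariance matrix, and the use of a pseudo-inverse are assumptions made for illustration only.

```python
import numpy as np


def euclidean_distance(x, y):
    """Straight-line distance; treats every feature as equally scaled."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.sqrt(diff @ diff))


def mahalanobis_distance(x, y, cov):
    """Distance scaled by the feature covariance matrix `cov`.

    The pseudo-inverse is an assumption of this sketch; it keeps the
    computation defined even when `cov` is singular.
    """
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    inv_cov = np.linalg.pinv(np.asarray(cov, dtype=float))
    return float(np.sqrt(diff @ inv_cov @ diff))


# Hypothetical toy values, not taken from the HDI data set.
cov = np.array([[4.0, 0.0],
                [0.0, 1.0]])  # first feature has variance 4
a, b = [2.0, 0.0], [0.0, 0.0]

print(euclidean_distance(a, b))         # 2.0
print(mahalanobis_distance(a, b, cov))  # close to 1.0: the gap is one standard deviation
```

With an identity covariance matrix the two distances coincide, so the Mahalanobis variant only changes neighbour selection when features differ in scale or are correlated. Inverting a covariance matrix adds work per distance evaluation, which may contribute to the longer computation time the abstract reports, although the source does not say so.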

Co-Authors: Anwar Fitrianto; Yunna Mentari Indah
