Brilliant Friezka Aina
Electrical Engineering, Faculty of Electrical Engineering, Telkom University

Enhancing Binary Classification Performance in Biomedical Datasets: Regularized ELM with SMOTE and Quantile Transforms Focused on Breast Cancer Analysis
Brilliant Friezka Aina; Meta Kallista; Ig. Prasetya Dwi Wibawa; Ginaldi Ari Nugroho; Ivana Meiska; Syifa Melinda Naf’an
CAUCHY: Jurnal Matematika Murni dan Aplikasi, Vol 9, No 2 (2024)
Publisher : Mathematics Department, Universitas Islam Negeri Maulana Malik Ibrahim Malang

DOI: 10.18860/ca.v9i2.28785

Abstract

This study addresses class imbalance in binary classification tasks on microarray datasets. The objective is to improve classification performance by combining a regularized Extreme Learning Machine (ELM) with the Synthetic Minority Over-sampling Technique (SMOTE) for over-sampling and the Quantile Transformer for scaling. Biological datasets were gathered from established sources such as UCI and Kaggle, including Pima Indian Diabetes, Heart Disease, and Wisconsin Breast Cancer. During dataset preparation, SMOTE was applied to address the class imbalance. The data were then split into training (80%) and testing (20%) sets and scaled with the Quantile Transformer, which maps numerical input variables to a Gaussian or uniform probability distribution. To improve accuracy, ELMs were trained with regularization. Regularized ELM (R-ELM) outperforms ELM in terms of AUC, although ELM has the shorter computation time. The choice of the regularization parameter (C) in R-ELM affects both the model's performance and its computation time. Overall, R-ELM with SMOTE yields encouraging results for classifying biological datasets. Further investigation and validation on additional datasets are nevertheless needed to establish its generalizability and robustness.
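
The abstract does not state how R-ELM is computed, so the following is only a minimal sketch of one common formulation, assumed here and not taken from the paper: hidden-layer weights are drawn at random and kept fixed, and the output weights are obtained in closed form as beta = (H^T H + I/C)^(-1) H^T t, where C is the regularization parameter. The function names, the sigmoid activation, the hidden-layer size, the value of C, and the synthetic two-class data are all illustrative assumptions. SMOTE and quantile scaling are left out to keep the example short; only the 80/20 split follows the abstract.

```python
import numpy as np

def train_relm(X, y, n_hidden=50, C=1.0, seed=0):
    """Fit a regularized ELM for binary labels y in {0, 1} (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    # Random input weights and biases; these are never trained.
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    # Hidden-layer output matrix H with a sigmoid activation (assumed choice).
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    # Encode targets as -1 / +1 so the decision threshold is 0.
    t = 2.0 * y - 1.0
    # Ridge-style closed form: (H^T H + I/C) beta = H^T t.
    A = H.T @ H + np.eye(n_hidden) / C
    beta = np.linalg.solve(A, H.T @ t)
    return W, b, beta

def predict_relm(model, X):
    """Return 0/1 class predictions from a fitted (W, b, beta) tuple."""
    W, b, beta = model
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return (H @ beta > 0).astype(int)

# Synthetic, balanced two-class data (hypothetical, not one of the paper's datasets).
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(loc=-1.0, size=(100, 5)),
               rng.normal(loc=1.0, size=(100, 5))])
y = np.concatenate([np.zeros(100), np.ones(100)]).astype(int)

# 80% training / 20% testing split, as described in the abstract.
idx = rng.permutation(len(y))
train, test = idx[:160], idx[160:]

model = train_relm(X[train], y[train], n_hidden=50, C=10.0)
acc = (predict_relm(model, X[test]) == y[test]).mean()
print(f"test accuracy: {acc:.3f}")
```

In this formulation a smaller C means a stronger penalty on the output weights, and letting C grow very large approaches the unregularized least-squares ELM solution, which is one way to read the abstract's remark that the choice of C influences performance.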