CAUCHY: Jurnal Matematika Murni dan Aplikasi
Vol 9, No 2 (2024)

Enhancing Binary Classification Performance in Biomedical Datasets: Regularized ELM with SMOTE and Quantile Transforms Focused on Breast Cancer Analysis

Brilliant Friezka Aina (Electrical Engineering, Faculty of Electrical Engineering, Telkom University)
Meta Kallista (CoE HUMIC, Computer Engineering, Faculty of Electrical Engineering, Telkom University)
Ig. Prasetya Dwi Wibawa (CoE STAR, Electrical Engineering, Faculty of Electrical Engineering, Telkom University)
Ginaldi Ari Nugroho (National Research and Innovation Agency)
Ivana Meiska (Computer Engineering, Faculty of Electrical Engineering, Telkom University)
Syifa Melinda Naf’an (Computer Engineering, Faculty of Electrical Engineering, Telkom University)



Article Info

Publish Date
01 Nov 2024

Abstract

This study addresses the problem of imbalanced data in binary classification tasks on microarray datasets. The objective is to improve classification performance by adding regularization to the Extreme Learning Machine (ELM), using the Synthetic Minority Over-sampling Technique (SMOTE) for over-sampling and the Quantile Transformer for scaling. The study began by gathering biomedical datasets from established repositories such as UCI and Kaggle, including Pima Indian Diabetes, Heart Disease, and Wisconsin Breast Cancer. SMOTE was applied during dataset preparation to correct the class imbalance. The data were then split into training (80%) and testing (20%) sets and scaled with the Quantile Transformer, which maps numerical input variables to a Gaussian or uniform probability distribution. ELMs were then trained with regularization to improve accuracy. Regularized ELM (R-ELM) surpasses ELM in terms of AUC, although ELM computes faster. The choice of the regularization parameter (C) in R-ELM influences both the model's performance and its computation time. Overall, R-ELM with SMOTE produces encouraging results in classifying biomedical datasets. Further investigation and validation on additional datasets are, however, necessary to establish its generalizability and robustness.
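The abstract does not spell out the R-ELM formulation, so the following is only a minimal sketch of the commonly used ridge-regularized ELM, in which the output weights are obtained as beta = (H^T H + I/C)^(-1) H^T t. The function names (train_relm, predict_relm), the tanh activation, the hidden-layer size, and the toy two-cluster data are all assumptions made for illustration; they are not taken from the paper. The SMOTE and quantile-scaling steps are omitted here.

```python
import numpy as np

def train_relm(X, y, n_hidden=50, C=1.0, seed=0):
    """Fit a ridge-regularized ELM (illustrative sketch, not the authors' code)."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))  # random input weights, never trained
    b = rng.normal(size=n_hidden)                # random hidden biases
    H = np.tanh(X @ W + b)                       # hidden-layer output matrix
    t = 2.0 * y - 1.0                            # map labels {0, 1} to targets {-1, +1}
    # Closed-form output weights: (H^T H + I/C) beta = H^T t
    A = H.T @ H + np.eye(n_hidden) / C
    beta = np.linalg.solve(A, H.T @ t)
    return W, b, beta

def predict_relm(X, W, b, beta):
    """Return predicted class labels (0 or 1)."""
    scores = np.tanh(X @ W + b) @ beta
    return (scores > 0).astype(int)

# Hypothetical toy data: two well-separated Gaussian clusters.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(loc=-2.0, size=(100, 2)),
               rng.normal(loc=2.0, size=(100, 2))])
y = np.array([0] * 100 + [1] * 100)

W, b, beta = train_relm(X, y, n_hidden=20, C=10.0)
accuracy = float(np.mean(predict_relm(X, W, b, beta) == y))

# A smaller C means stronger regularization and therefore smaller output weights.
beta_strong = train_relm(X, y, n_hidden=20, C=0.01)[2]
beta_weak = train_relm(X, y, n_hidden=20, C=100.0)[2]
```

In this formulation a large C approaches the unregularized least-squares ELM, while a small C shrinks the output weights, which is one way the choice of C can affect performance as the abstract describes.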

Copyrights © 2024






Journal Info

Abbrev

Math

Publisher

Subject

Mathematics

Description

Jurnal CAUCHY is published regularly two (2) times a year. The editors accept scientific papers reporting research results, literature reviews, analyses, and problem solving in the field of Mathematics (Algebra, Analysis, Statistics, Computation, and Applied Mathematics). Accepted manuscripts will be reviewed by ...