Indonesian Journal of Artificial Intelligence and Data Mining
Vol 8, No 1 (2025): March 2025

Enhancing Single Nucleotide Polymorphisms Detection from Imbalanced Data: A Study of Resampling Techniques in Machine Learning Algorithms

Nurhasanah, Rossy
Arisandi, Dedy
Purnamasari, Fanindia
Hayatunnufus, Hayatunnufus
Simangunsong, Daisy Sere Damara
Pulungan, Aflah Mutsanni



Article Info

Publish Date
18 Mar 2025

Abstract

Identifying true Single Nucleotide Polymorphisms (SNPs) from Next Generation Sequencing (NGS) data gives rise to a class imbalance problem because of the inherently high error rate of NGS technology. Class imbalance negatively affects machine learning algorithms: it produces biased models and poor performance, particularly in detecting true SNPs, which belong to the underrepresented class. This study evaluates the effectiveness of several resampling techniques, including Borderline-SMOTE, Random Undersampling, and Tomek-Link, in enhancing the performance of machine learning algorithms, specifically Random Forest (RF) and Artificial Neural Networks (ANN), and compares these techniques to determine the most effective approach. Our results indicate that Borderline-SMOTE improves the F-Measure of RF from 69.72 to 91.52 (a 31.2% relative increase) and of ANN from 79.75 to 91.32 (a 14.5% relative increase), outperforming the other resampling methods. These findings highlight the crucial role of resampling techniques and the careful selection of algorithms in improving classification accuracy for imbalanced datasets.
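
The quantities named in the abstract can be illustrated with a small Python sketch. It computes the F-Measure from confusion counts, applies random undersampling to a toy label set, and recomputes the relative gains from the reported scores. The function names (`f_measure`, `random_undersample`, `relative_gain`) and the toy data (95 negatives, 5 positives) are assumptions made for illustration; the abstract does not describe the authors' implementation, and this is not it.

```python
import random

def f_measure(tp, fp, fn):
    """F-Measure (F1): harmonic mean of precision and recall for the positive class."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def random_undersample(samples, labels, seed=0):
    """Randomly drop samples until both classes have the size of the smaller one.

    Assumes binary labels 0 and 1 (hypothetical encoding: 1 = true SNP).
    """
    rng = random.Random(seed)
    by_class = {0: [], 1: []}
    for x, y in zip(samples, labels):
        by_class[y].append(x)
    n_min = min(len(by_class[0]), len(by_class[1]))
    pairs = []
    for y, xs in by_class.items():
        pairs.extend((x, y) for x in rng.sample(xs, n_min))
    rng.shuffle(pairs)
    return [x for x, _ in pairs], [y for _, y in pairs]

def relative_gain(before, after):
    """Relative increase from `before` to `after`, in percent."""
    return 100.0 * (after - before) / before

# Toy imbalanced data: 95 negatives and 5 positives (illustrative only).
toy_x = list(range(100))
toy_y = [0] * 95 + [1] * 5

# A model that always predicts the majority class is 95% accurate,
# yet it finds no positives: tp = 0, fp = 0, fn = 5.
baseline_accuracy = 95 / 100
baseline_f = f_measure(tp=0, fp=0, fn=5)

# After random undersampling, both classes have 5 samples each.
balanced_x, balanced_y = random_undersample(toy_x, toy_y)

# Relative gains recomputed from the F-Measure values reported in the abstract.
rf_gain = relative_gain(69.72, 91.52)
ann_gain = relative_gain(79.75, 91.32)
print(f"RF gain: {rf_gain:.2f}%, ANN gain: {ann_gain:.2f}%")
```

The recomputed gains agree with the reported 31.2% and 14.5% up to rounding. The toy baseline shows why accuracy alone is misleading on imbalanced data and why the study reports the F-Measure. For the methods named in the abstract, the imbalanced-learn library offers `BorderlineSMOTE`, `RandomUnderSampler`, and `TomekLinks`; whether the authors used that library is not stated here.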

Copyrights © 2025






Journal Info

Abbrev

IJAIDM

Publisher

Subject

Computer Science & IT

Description

Indonesian Journal of Artificial Intelligence and Data Mining (IJAIDM) is an electronic periodical publication published by Puzzle Research Data Technology (Predatech) Faculty of Science and Technology UIN Sultan Syarif Kasim Riau, Indonesia. IJAIDM provides online media to publish scientific ...