Evaluation of Biclustering Imputation Methods for Glioblastoma Gene Expression Data
Silalahi, Agatha; Titin Siswantining; Setia Pramana
Enthusiastic : International Journal of Applied Statistics and Data Science Volume 5 Issue 1, April 2025
Publisher : Universitas Islam Indonesia

DOI: 10.20885/enthusiastic.vol5.iss1.art7

Abstract

Glioblastoma is a highly aggressive primary brain tumor with a low survival rate. One of the main challenges in analyzing glioblastoma gene expression data is the presence of missing values, which can reduce biclustering accuracy and affect biological interpretation. This research compared six imputation methods (k-nearest neighbors (KNN), mean imputation, singular value decomposition, nonnegative matrix factorization, soft impute, and autoencoder) on the GSE4290 gene expression dataset, with the proportion of missing values ranging from 5% to 50%. An evaluation using root mean square error (RMSE), mean absolute error (MAE), and structural similarity index measure (SSIM) showed that soft impute provided the best performance at all levels of missing values, with an RMSE of 0.0076, an MAE of 0.0073, and a perfect SSIM of 1.0000 at 50% missing values. Meanwhile, the deep learning-based autoencoder showed significant performance degradation at high proportions of missing values. These findings indicate that more complex models are not always superior, and that regularization-based approaches like soft impute are more effective in preserving the biological structure of the data. The results of this research contribute to the optimization of imputation strategies to improve the accuracy of biclustering analysis in glioblastoma studies.
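
The evaluation protocol described in the abstract (hide a known fraction of entries, impute them, then score the imputed values against the held-out truth) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' code: the data is a synthetic low-rank matrix rather than GSE4290, the function names (`mask_at_random`, `mean_impute`, `soft_impute`, `rmse`, `mae`) are hypothetical, the soft-impute routine is a plain NumPy version of iterative singular-value soft-thresholding with an arbitrarily chosen penalty `lam`, and SSIM is left out to avoid extra dependencies.

```python
import numpy as np

def mask_at_random(X, frac, rng):
    """Copy X and set roughly a fraction `frac` of its entries to NaN."""
    mask = rng.random(X.shape) < frac
    X_missing = X.copy()
    X_missing[mask] = np.nan
    return X_missing, mask

def mean_impute(X_missing):
    """Fill each missing entry with the mean of the observed values in its column."""
    col_means = np.nanmean(X_missing, axis=0)
    return np.where(np.isnan(X_missing), col_means, X_missing)

def soft_impute(X_missing, lam=1.0, n_iter=100, tol=1e-5):
    """Iterative SVD soft-thresholding; lam is an illustrative choice, not a tuned value."""
    missing = np.isnan(X_missing)
    Z = np.where(missing, 0.0, X_missing)  # start with zeros in the missing cells
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(Z, full_matrices=False)
        s_shrunk = np.maximum(s - lam, 0.0)  # shrink singular values toward zero
        low_rank = (U * s_shrunk) @ Vt
        # Keep observed entries fixed; refresh only the missing ones.
        Z_new = np.where(missing, low_rank, X_missing)
        change = np.linalg.norm(Z_new - Z)
        Z = Z_new
        if change <= tol * max(np.linalg.norm(Z), 1.0):
            break
    return Z

def rmse(X_true, X_hat, mask):
    """Root mean square error over the masked (held-out) entries only."""
    diff = X_true[mask] - X_hat[mask]
    return float(np.sqrt(np.mean(diff ** 2)))

def mae(X_true, X_hat, mask):
    """Mean absolute error over the masked (held-out) entries only."""
    return float(np.mean(np.abs(X_true[mask] - X_hat[mask])))

# Synthetic stand-in for an expression matrix (samples x genes), rank 3.
rng = np.random.default_rng(0)
X_full = rng.normal(size=(100, 3)) @ rng.normal(size=(3, 40))

for frac in (0.05, 0.25, 0.50):
    X_missing, mask = mask_at_random(X_full, frac, rng)
    for name, impute in (("mean", mean_impute), ("soft impute", soft_impute)):
        X_hat = impute(X_missing)
        print(f"{frac:.0%} missing, {name}: "
              f"RMSE={rmse(X_full, X_hat, mask):.4f}, MAE={mae(X_full, X_hat, mask):.4f}")
```

Scoring only the masked entries matters here: observed entries are reproduced exactly by both imputers, so including them would shrink the reported errors as the missing fraction decreases and make the levels hard to compare.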