Bouazza, Sara Haddou
Unknown Affiliation

Published: 2 Documents
Articles

Revolutionizing cancer classification: the SNR-OGSCC method for improved gene selection and clustering
Bouazza, Sara Haddou; Bouazza, Jihad Haddou
IAES International Journal of Artificial Intelligence (IJ-AI) Vol 14, No 1: February 2025
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijai.v14.i1.pp466-472

Abstract

This study presents the signal-to-noise ratio optimized gene selection and clustering for cancer classification (SNR-OGSCC) methodology, aimed at enhancing classification accuracy while reducing the dimensionality of gene expression data across various cancer types. Implemented on a standard computational setup, the SNR-OGSCC method combines advanced filtering, clustering, and machine learning techniques, demonstrating significant improvements in classification accuracy on seven cancer datasets: leukemia, colon cancer, prostate cancer, lung cancer, lymphoma, central nervous system (CNS) tumors, and ovarian cancer. Notably, our approach achieved perfect accuracies of 100% for leukemia, lung cancer, and ovarian cancer, with high accuracies of 98.4% for colon cancer, 99.1% for prostate cancer, 98.3% for lymphoma, and 99.7% for CNS tumors, while requiring as few as 4–5 genes for effective classification. These findings highlight the efficiency and robustness of the SNR-OGSCC methodology, suggesting its potential to identify meaningful biomarkers and improve personalized cancer treatment strategies. Further validation with larger datasets and biological experiments is essential to confirm its applicability in clinical settings.
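The abstract does not spell out how the signal-to-noise ratio is computed. As a rough illustration only, the sketch below assumes the commonly used two-class form, (mean of class 0 minus mean of class 1) divided by (standard deviation of class 0 plus standard deviation of class 1), and ranks genes by its absolute value. The function names `snr_scores` and `top_genes` and the toy data are hypothetical and are not taken from the paper; the clustering and classification stages of SNR-OGSCC are not reproduced.

```python
from statistics import mean, stdev


def snr_scores(expression, labels):
    """Return one signal-to-noise score per gene.

    expression: list of samples, each a list of gene expression values.
    labels: list of 0/1 class labels, one per sample.
    Assumed form (not confirmed by the abstract):
        (mean_0 - mean_1) / (std_0 + std_1)
    """
    n_genes = len(expression[0])
    scores = []
    for g in range(n_genes):
        class0 = [row[g] for row, y in zip(expression, labels) if y == 0]
        class1 = [row[g] for row, y in zip(expression, labels) if y == 1]
        denom = stdev(class0) + stdev(class1)
        # Guard against constant genes, which would give a zero denominator.
        scores.append((mean(class0) - mean(class1)) / denom if denom > 0 else 0.0)
    return scores


def top_genes(expression, labels, k):
    """Return the indices of the k genes with the largest absolute score."""
    scores = snr_scores(expression, labels)
    ranked = sorted(range(len(scores)), key=lambda g: abs(scores[g]), reverse=True)
    return ranked[:k]


# Toy data: 4 samples, 2 genes. Gene 0 separates the classes; gene 1 does not.
toy_expression = [[1.0, 5.0], [1.2, 3.0], [5.0, 4.0], [5.2, 6.0]]
toy_labels = [0, 0, 1, 1]
selected = top_genes(toy_expression, toy_labels, k=1)
```

On the toy data, gene 0 has well-separated class means and small within-class spread, so it ranks first. A real pipeline would apply this filter to thousands of genes before any clustering step.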

Deep learning-based feature selection for lung adenocarcinoma classification and biomarker discovery
Bouazza, Sara Haddou; Bouazza, Jihad Haddou
IAES International Journal of Artificial Intelligence (IJ-AI) Vol 14, No 6: December 2025
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijai.v14.i6.pp4703-4710

Abstract

Lung adenocarcinoma (LUAD), a leading cause of cancer-related mortality, underscores the need for reliable diagnostic tools. This study proposes a robust multi-stage feature selection and classification framework for biomarker discovery, using The Cancer Genome Atlas Lung Adenocarcinoma (TCGA-LUAD) cohort as the primary dataset and GSE19188 for independent validation. The framework combines differential expression analysis (Wilcoxon rank-sum test), joint mutual information maximization (JMIM), and sparse autoencoder-based refinement to identify a compact and predictive set of five genes. These genes are involved in key lung cancer pathways, including epidermal growth factor receptor (EGFR) signaling, cell cycle regulation, and immune response, and include biomarkers such as surfactant protein A2 (SFTPA2), napsin A aspartic peptidase (NAPSA), and T-box transcription factor 4 (TBX4). The hybrid deep learning classifier achieved high accuracy (98.4%) and area under the receiver operating characteristic curve (AUC-ROC) (0.996) on TCGA-LUAD, with strong generalization on GSE19188 (accuracy: 96.7%, AUC-ROC: 0.993). Overall, the framework offers an interpretable and effective solution for LUAD classification and biomarker identification.
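The first stage named in the abstract is a Wilcoxon rank-sum test for differential expression. The sketch below illustrates that test for a single gene using the standard normal approximation with midranks for ties and no tie correction of the variance. It is a minimal sketch, not the authors' pipeline: the function names `midranks` and `rank_sum_test` and the sample values are hypothetical, and the JMIM and sparse autoencoder stages are not covered.

```python
import math


def midranks(values):
    """Assign 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test via the normal approximation.

    Returns (z, p). No tie correction is applied to the variance.
    """
    n1, n2 = len(x), len(y)
    ranks = midranks(list(x) + list(y))
    w = sum(ranks[:n1])  # rank sum of the first group
    mean_w = n1 * (n1 + n2 + 1) / 2
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p


# Hypothetical expression values of one gene in tumor and normal samples.
tumor = [1.0, 2.0, 3.0, 4.0, 5.0]
normal = [6.0, 7.0, 8.0, 9.0, 10.0]
z_stat, p_value = rank_sum_test(tumor, normal)
```

In a differential expression filter, this test would be run once per gene, and genes whose p-value (usually after multiple-testing correction) falls below a chosen threshold would be passed to the next stage. The threshold used in the paper is not stated in the abstract.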