Journal of Applied Data Sciences
Vol 6, No 4: December 2025

Unsupervised Neural Networks for Breast Cancer Clustering: A Comparative Study of RBMs and SOMs with Interpretability Metrics

Soundes, Mekki
Ahlam, Labdaoui



Article Info

Publish Date
02 Sep 2025

Abstract

This study presents a comparative analysis of two unsupervised neural network models, Restricted Boltzmann Machines (RBMs) and Self-Organizing Maps (SOMs), applied to breast cancer data clustering. The primary objective is to evaluate and benchmark these models in terms of their latent feature extraction, clustering accuracy, and interpretability in a medical diagnostic context. Using a preprocessed breast cancer dataset comprising 569 patient records and 30 clinical features, the models were trained and evaluated on two internal clustering metrics: the Silhouette Score and the Davies-Bouldin Index (DBI). The proposed methodology, implemented in Python, emphasizes reproducibility and diagnostic relevance. RBMs achieved a Silhouette Score of 0.88 and a DBI of 0.52, indicating compact and well-separated clusters, while SOMs performed markedly worse, with a Silhouette Score of 0.34 and a DBI of 1.47. When clusters are mapped to diagnostic labels, RBMs yield precision between 0.82 and 0.92 and recall between 0.87 and 0.89 across the benign and malignant classes. SOMs, although less accurate, offer superior visualization of high-dimensional data, which aids exploratory analysis and interpretability. The key contribution of this work is a standardized evaluation framework for unsupervised neural clustering in healthcare that combines quantitative clustering metrics with qualitative insights into clinical applicability. The findings indicate that RBMs are better suited to diagnostic tasks that demand strong pattern recognition, whereas SOMs retain value for data exploration and decision explanation. This research introduces a novel integration of RBM-based clustering into medical analytics, highlighting its potential to support decision-making in oncology. Future work will extend this approach to hybrid models and multi-modal datasets, aiming to balance performance and explainability in complex diagnostic environments.
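To make the two internal metrics named in the abstract concrete, the following minimal sketch computes the Silhouette Score and the Davies-Bouldin Index from their standard definitions. It is not the authors' implementation. The function names, the Euclidean distance choice, and the eight-point toy dataset are assumptions made purely for illustration. In practice, scikit-learn's `silhouette_score` and `davies_bouldin_score` compute the same quantities.

```python
from math import dist
from statistics import mean


def _group(points, labels):
    """Collect points into a dict keyed by cluster label."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    return clusters


def silhouette(points, labels):
    """Mean silhouette coefficient (range -1 to 1, higher is better)."""
    clusters = _group(points, labels)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        # (dist(p, p) == 0, so summing over `own` and dividing by n - 1 works)
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(mean(dist(p, q) for q in other)
                for other_lab, other in clusters.items() if other_lab != lab)
        scores.append((b - a) / max(a, b))
    return mean(scores)


def davies_bouldin(points, labels):
    """Davies-Bouldin Index (0 or greater, lower is better)."""
    clusters = _group(points, labels)
    centroids = {lab: tuple(mean(axis) for axis in zip(*members))
                 for lab, members in clusters.items()}
    # scatter: mean distance from each member to its cluster centroid
    scatter = {lab: mean(dist(p, centroids[lab]) for p in members)
               for lab, members in clusters.items()}
    worst_ratios = [
        max((scatter[i] + scatter[j]) / dist(centroids[i], centroids[j])
            for j in clusters if j != i)
        for i in clusters
    ]
    return mean(worst_ratios)


# Hypothetical toy data: two tight, well-separated squares (not the study's data).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0)]
good_labels = [0, 0, 0, 0, 1, 1, 1, 1]   # matches the true grouping
bad_labels = [0, 1, 0, 1, 0, 1, 0, 1]    # deliberately mixes the two groups

print("good:", silhouette(points, good_labels), davies_bouldin(points, good_labels))
print("bad: ", silhouette(points, bad_labels), davies_bouldin(points, bad_labels))
```

On this toy layout the correct grouping gives a silhouette close to 1 and a DBI near 0.1, while the mixed labeling gives a negative silhouette and a much larger DBI. That is the same direction of contrast the abstract reports between the RBM and SOM results.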

Copyrights © 2025






Journal Info

Abbrev

JADS

Subject

Computer Science & IT; Control & Systems Engineering; Decision Sciences, Operations Research & Management

Description

One of the current hot topics in science is data: how can datasets be used in scientific and scholarly research in a more reliable, citable and accountable way? Data is of paramount importance to scientific progress, yet most research data remains private. Enhancing the transparency of the processes ...