International Journal of Engineering, Science and Information Technology
Vol 5, No 3 (2025)

Hybrid Deep Fixed K-Means (HDF-KMeans)

Zuhanda, Muhammad Khahfi
Kohsasih, Kelvin Leonardi
Octaviandy, Pieter
Hartono, Hartono
Kurnia, Dian
Tarigan, Nurliana
Ginting, Manan
Hutagalung, Manahan



Article Info

Publish Date
23 May 2025

Abstract

K-Means is one of the most widely used clustering algorithms due to its simplicity, scalability, and computational efficiency. However, its practical application is often hindered by several well-known limitations: high sensitivity to initial centroid selection, inconsistent results across runs, and suboptimal performance on high-dimensional or non-linearly separable data. To address these issues, this study introduces a hybrid clustering algorithm named Hybrid Deep Fixed K-Means (HDF-KMeans). The approach combines the advantages of two techniques: Deep K-Means++ and Fixed Centered K-Means. Deep K-Means++ uses deep learning-based feature extraction to transform raw data into more meaningful representations and employs advanced centroid initialization to enhance clustering accuracy and adaptability to complex data structures. Complementarily, Fixed Centered K-Means improves the stability of clustering results by locking certain centroids based on domain knowledge or adaptive strategies, which reduces run-to-run variability and inconsistent convergence. Integrating the two methods yields a hybrid model with improved accuracy and consistency in clustering performance. The proposed HDF-KMeans algorithm is evaluated on five benchmark medical datasets: Breast Cancer, COVID-19, Diabetes, Heart Disease, and Thyroid. Performance is assessed using standard classification metrics: Accuracy, Precision, Recall, and F1-Score. The results show that HDF-KMeans outperforms traditional K-Means, K-Means++, and K-Means-SMOTE across all datasets in overall Accuracy and F1-Score. Although some trade-offs are observed in individual Precision or Recall values, the model maintains a solid balance between the two, demonstrating reliability. This study highlights HDF-KMeans as a promising and effective solution for complex clustering tasks, particularly in high-stakes domains such as healthcare and biomedical analysis.
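
The combination of locked centroids with K-Means++-style seeding described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `kmeanspp_init` and `fixed_center_kmeans` are hypothetical, the deep feature-extraction stage is omitted (the input `X` is assumed to already be the extracted feature matrix), and the locking rule used here (the first `len(fixed)` centroids are never updated) is only one plausible reading of "locking certain centroids".

```python
import numpy as np


def kmeanspp_init(X, k, fixed, rng):
    """Seed k centroids: keep the fixed ones, then add the rest by D^2 sampling."""
    centroids = [np.asarray(c, dtype=float) for c in fixed]
    if not centroids:
        # No fixed centroids given: start from a uniformly random data point.
        centroids.append(X[rng.integers(len(X))])
    while len(centroids) < k:
        C = np.stack(centroids)
        # Squared distance from every point to its nearest existing centroid.
        d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2).min(axis=1)
        total = d2.sum()
        # Fall back to uniform sampling if all points coincide with centroids.
        probs = d2 / total if total > 0 else np.full(len(X), 1.0 / len(X))
        centroids.append(X[rng.choice(len(X), p=probs)])
    return np.stack(centroids)


def fixed_center_kmeans(X, k, fixed=(), n_iter=100, seed=0):
    """Lloyd iterations in which the first len(fixed) centroids stay locked."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    centroids = kmeanspp_init(X, k, fixed, rng)
    n_fixed = len(fixed)
    for _ in range(n_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Update step: only the free (non-fixed) centroids move.
        new = centroids.copy()
        for j in range(n_fixed, k):
            members = X[labels == j]
            if len(members):
                new[j] = members.mean(axis=0)
        if np.allclose(new, centroids):
            break
        centroids = new
    # Final assignment so the labels match the returned centroids.
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return centroids, d2.argmin(axis=1)
```

In this sketch, passing `fixed=[[0.0, 0.0]]` with `k=2` keeps the first centroid at the origin while the second is seeded and refined from the data, so repeated runs share the same anchor point. In the setting the abstract describes, `X` would be the output of the deep feature extractor rather than raw data.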

Copyrights © 2025