Engineering and Technology International Journal (EATIJ)
Vol 8 No 01 (2026)

An Adaptive Hybrid Clustering Framework Integrating K-Means and Differential Evolution for High-Dimensional Data Analysis




Article Info

Publish Date
19 Feb 2026

Abstract

Clustering high-dimensional data remains a foundational yet persistently challenging problem in unsupervised machine learning, primarily because the performance of centroid-based methods such as K-Means degrades sharply in high-dimensional spaces owing to their sensitivity to local optima and the curse of dimensionality. This paper proposes an Adaptive Hybrid Clustering Framework (AHCF) that integrates K-Means with Differential Evolution (DE) optimisation to systematically overcome the dependence of K-Means on initial centroid placement in high-dimensional settings. The proposed framework introduces three novel components: (1) an adaptive mutation factor (F) governed by a monotonically decreasing annealing schedule that transitions from broad global exploration (F=0.90) to fine local exploitation (F=0.40) across generations; (2) an adaptive crossover probability (CR) that increases linearly from 0.50 to 0.90, progressively favouring population diversity as the search converges; and (3) a centroid refinement step that projects each centroid of a DE trial solution back onto its cluster mean, ensuring geometrically valid centroid positions throughout the evolutionary search. Experiments on a synthetically generated high-dimensional dataset (n=1,500, d=32, k=5) demonstrate that AHCF achieves a Silhouette Score of 0.6127, a Davies-Bouldin Index of 0.5023, and a Calinski-Harabasz Index of 2834.6. These are improvements of 2.7%, 7.2%, and 6.9%, respectively, over a strong K-Means baseline (n_init=20). The proposed adaptive mechanism delivers a 75.2% reduction in Within-Cluster Sum of Squares (WCSS; from 22,516 to 5,592) and converges faster than a static-parameter equivalent. These results establish AHCF as a robust, theoretically grounded, and practically deployable framework for high-dimensional clustering tasks in data mining and machine learning applications.
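
The three adaptive components described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. It assumes a linear schedule for F (the abstract gives only the endpoints and says the schedule decreases monotonically), a DE/rand/1/bin scheme, WCSS as the fitness function, and a refinement step consisting of one nearest-centroid assignment followed by a mean update. The function names `adaptive_params`, `refine_centroids`, `wcss`, and `ahcf_sketch`, and the defaults for `pop_size`, `max_gen`, and `seed`, are hypothetical.

```python
import numpy as np

F_START, F_END = 0.90, 0.40    # mutation factor endpoints from the abstract
CR_START, CR_END = 0.50, 0.90  # crossover probability endpoints from the abstract


def adaptive_params(gen, max_gen):
    """Return (F, CR) for generation `gen` in [0, max_gen - 1].

    Assumption: F decays linearly; the abstract states CR grows linearly.
    """
    t = gen / max(max_gen - 1, 1)
    F = F_START + (F_END - F_START) * t
    CR = CR_START + (CR_END - CR_START) * t
    return F, CR


def sq_dists(X, centroids):
    """Squared Euclidean distances, shape (n_points, n_centroids)."""
    return ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)


def wcss(X, centroids):
    """Within-cluster sum of squares under nearest-centroid assignment."""
    return sq_dists(X, centroids).min(axis=1).sum()


def refine_centroids(X, centroids):
    """Move each centroid to the mean of the points assigned to it.

    Assumption: a centroid with no assigned points is left unchanged.
    """
    labels = sq_dists(X, centroids).argmin(axis=1)
    refined = centroids.copy()
    for j in range(centroids.shape[0]):
        members = X[labels == j]
        if len(members) > 0:
            refined[j] = members.mean(axis=0)
    return refined


def ahcf_sketch(X, k, pop_size=10, max_gen=30, seed=0):
    """Hypothetical DE/rand/1/bin loop over centroid sets (pop_size >= 4)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Each individual is a (k, d) centroid array initialised from data points.
    pop = np.stack([X[rng.choice(n, size=k, replace=False)]
                    for _ in range(pop_size)]).astype(float)
    fitness = np.array([wcss(X, ind) for ind in pop])
    for gen in range(max_gen):
        F, CR = adaptive_params(gen, max_gen)
        for i in range(pop_size):
            others = [j for j in range(pop_size) if j != i]
            a, b, c = rng.choice(others, size=3, replace=False)
            mutant = pop[a] + F * (pop[b] - pop[c])
            mask = rng.random((k, d)) < CR
            mask[rng.integers(k), rng.integers(d)] = True  # keep one mutant gene
            trial = refine_centroids(X, np.where(mask, mutant, pop[i]))
            f_trial = wcss(X, trial)
            if f_trial <= fitness[i]:  # greedy selection
                pop[i], fitness[i] = trial, f_trial
    best = fitness.argmin()
    return pop[best], fitness[best]
```

Calling `ahcf_sketch(X, k=5)` on an `(n, d)` float array returns the best centroid set found and its WCSS. Because every trial passes through `refine_centroids` before evaluation, each accepted centroid is the mean of the points assigned to it, which is one plausible reading of the refinement step in component (3).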

Copyrights © 2026