The choice of initial centroids strongly affects the quality of K-Means clustering results. This study proposes a new approach to determining the initial centroids that uses the global mean and the per-dimension variance of the data. The global mean represents the overall center of the data, while the variance of each dimension describes how widely that feature is spread. The method is tested on a three-dimensional synthetic dataset (X, Y, Z) of 121 data points and compared with random initialization. The results show that the mean- and variance-based method produces more balanced clusters, lower Sum of Squared Error (SSE) values, the highest Silhouette Score (0.65), and faster convergence. Compared with two random initialization scenarios, the method is more stable in separating clusters according to low, medium, and high value ranges. This approach contributes to the development of a more consistent and effective K-Means initialization strategy, especially for low- to medium-dimensional numerical datasets.
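The abstract does not state the exact formula used to turn the global mean and variance into centroids, so the following Python sketch is only one plausible reading and not the study's implementation. It assumes that the k initial centroids are spaced evenly between one standard deviation below and one standard deviation above the global mean in every dimension. The function names `mean_variance_init` and `kmeans`, the offset rule, and the generated data (three well-separated groups totalling 121 points) are all hypothetical and chosen for illustration.

```python
import numpy as np


def mean_variance_init(X, k=3):
    """Build k initial centroids from the global mean and per-dimension variance.

    Assumption (not taken from the study): centroids are spaced evenly
    from mean - std to mean + std in every dimension.
    """
    mu = X.mean(axis=0)              # global mean of each dimension
    sigma = np.sqrt(X.var(axis=0))   # spread of each dimension, from its variance
    offsets = np.linspace(-1.0, 1.0, k) if k > 1 else np.zeros(1)
    return mu + offsets[:, None] * sigma


def assign(X, centroids):
    """Return the index of the nearest centroid for every point."""
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def kmeans(X, centroids, max_iter=100):
    """Plain Lloyd iterations starting from the given centroids."""
    centroids = centroids.copy()
    n_iter = 0
    for n_iter in range(1, max_iter + 1):
        labels = assign(X, centroids)
        # Keep the old centroid if a cluster happens to be empty.
        updated = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(len(centroids))
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    labels = assign(X, centroids)
    sse = float(((X - centroids[labels]) ** 2).sum())
    return labels, centroids, sse, n_iter


# Hypothetical synthetic data: low, medium, and high groups in (X, Y, Z).
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=center, scale=1.0, size=(n, 3))
    for center, n in [(10.0, 40), (50.0, 40), (90.0, 41)]
])

init = mean_variance_init(X, k=3)
labels, centroids, sse, n_iter = kmeans(X, init)
print("cluster sizes:", np.bincount(labels, minlength=3))
print("SSE:", sse, "iterations:", n_iter)
```

Because this initialization is deterministic, repeated runs on the same data start from the same centroids, which is one way to read the stability claim. A comparison with random initialization would call `kmeans` with k rows of `X` drawn at random and compare the resulting SSE and iteration counts.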
Copyright © 2025