In the growing digital era, clustering big data has become a major challenge in data analysis, especially because the well-known K-Means algorithm has limitations when dealing with large-scale data. This study aims to optimize the K-Means algorithm for big data clustering through a distributed computing approach that improves both clustering efficiency and accuracy. The approach processes data in parallel across multiple computing nodes, optimizes memory usage, introduces an intelligent cluster center selection algorithm, and optimizes communication between nodes. The implementation of this optimized method improves the efficiency and accuracy of big data clustering while reducing execution time and memory consumption. The practical implications include better business decision-making and more effective marketing strategies based on more precise analysis of customer data.
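The parallel processing idea described above can be illustrated with a minimal sketch. It is not the study's implementation, which the abstract does not specify. It assumes the data is already split into chunks, one per node, and uses threads from Python's standard library as a stand-in for computing nodes. The names `partial_stats` and `distributed_kmeans` are hypothetical. Each simulated node returns only per-cluster sums and counts, which is one plausible way to keep communication between nodes small. The initial centers are passed in by the caller, so the intelligent cluster center selection step is not modeled here.

```python
from concurrent.futures import ThreadPoolExecutor
from math import dist


def partial_stats(chunk, centers):
    """Map step on one simulated node: per-cluster coordinate sums and counts."""
    dim = len(centers[0])
    sums = [[0.0] * dim for _ in centers]
    counts = [0] * len(centers)
    for point in chunk:
        # Assign the point to its nearest current center.
        nearest = min(range(len(centers)), key=lambda j: dist(point, centers[j]))
        counts[nearest] += 1
        for d in range(dim):
            sums[nearest][d] += point[d]
    return sums, counts


def distributed_kmeans(chunks, centers, iterations=10):
    """Lloyd iterations: chunks are processed independently, then reduced."""
    centers = [list(c) for c in centers]
    # Assumption: threads stand in for nodes; a real cluster would use a
    # distributed framework instead.
    with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
        for _ in range(iterations):
            # Send the current centers to every chunk and collect the
            # small (sums, counts) summaries instead of the raw points.
            results = list(pool.map(partial_stats, chunks, [centers] * len(chunks)))
            new_centers = []
            for j, old in enumerate(centers):
                total = sum(counts[j] for _, counts in results)
                if total == 0:
                    new_centers.append(old)  # empty cluster: keep its center
                    continue
                new_centers.append([
                    sum(sums[j][d] for sums, _ in results) / total
                    for d in range(len(old))
                ])
            if new_centers == centers:
                break  # converged: no center moved
            centers = new_centers
    return centers
```

With two chunks such as `[[(0.0, 0.0), (0.0, 1.0)], [(10.0, 10.0), (10.0, 11.0)]]` and initial centers `[(0.0, 0.0), (10.0, 10.0)]`, the function returns `[[0.0, 0.5], [10.0, 10.5]]`. Only two small summaries per iteration cross the simulated node boundary, regardless of how many points each chunk holds.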
Copyright © 2023