Density-based clustering is usually more effective than centroid-based clustering when processing data of varying densities. This approach was pioneered by the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm. k-Means and DBSCAN differ significantly in how they handle data that contains noise. To this end, this research studies the impact of dimensionality reduction of high-dimensional data on the clustering results of k-Means, representing the centroid-based method, and of DBSCAN, representing the density-based method. Although the quality of the k-Means clustering results improved after dimensionality reduction by Singular Value Decomposition (SVD), from an initial average distance of 1.04136 to 0.003, the change is not statistically significant, so the results are considered equivalent. It can therefore be concluded statistically that SVD has no effect on the quality of k-Means clustering results. In DBSCAN, on the other hand, the effect of SVD dimensionality reduction is very significant: the average intra-cluster distance falls from 76.13480 to 13.71130, a reduction by a factor of about 5.55 (reported as a 555.27% improvement in quality). SVD also has a significant impact on the computation time of both SVD + k-Means and SVD + DBSCAN. SVD reduces the k-Means computation time from 3.68182 seconds to 2.09091 seconds, a speedup of 1.76 times, and reduces the DBSCAN computation time from 19.40000 seconds to 0.97500 seconds, a speedup of 19.89 times.
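The improvement and speedup factors quoted above can be recomputed from the reported before and after values. The following is a minimal sketch rather than the authors' code: the helper `ratio` and the variable names are hypothetical, the input numbers are taken directly from the abstract, and the relative reduction is an additional way of expressing the same DBSCAN result that the abstract does not itself report.

```python
def ratio(before: float, after: float) -> float:
    """Return how many times larger `before` is than `after` (hypothetical helper)."""
    return before / after

# Computation times in seconds, as reported in the abstract.
kmeans_speedup = ratio(3.68182, 2.09091)
dbscan_speedup = ratio(19.40000, 0.97500)

# Average intra-cluster distance for DBSCAN, before and after SVD.
dbscan_distance_ratio = ratio(76.13480, 13.71130)

# The same change expressed as a relative reduction (not stated in the abstract).
dbscan_relative_reduction = 1.0 - 13.71130 / 76.13480

print(f"k-Means speedup:            {kmeans_speedup:.2f}x")
print(f"DBSCAN speedup:             {dbscan_speedup:.2f}x")
print(f"DBSCAN distance ratio:      {dbscan_distance_ratio:.4f}")
print(f"DBSCAN relative reduction:  {dbscan_relative_reduction:.1%}")
```

Under this reading, the 555.27% figure corresponds to the before-to-after distance ratio multiplied by 100, which is equivalent to a relative reduction of roughly 82% in average intra-cluster distance.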
Copyrights © 2021