This research investigates how normalized and non-normalized attribute values in a clustered dataset influence the computation of Euclidean distance in the K-means algorithm. It also examines how different numbers of clusters, identified with the elbow method, affect the Davies-Bouldin Index (DBI). A dataset of 174 records is mined following the CRISP-DM (Cross-Industry Standard Process for Data Mining) approach. In the data preparation phase, min-max normalization is applied so that no attribute's values are dwarfed by those of another. The elbow method is used to select the optimal K value. In this investigation, two K values exhibit a significant mean reduction: K = 4 and K = 3. The DBI for 3 clusters is 0.9250, which is smaller than the DBI for 4 clusters, 1.1584. The fundamental principle of the Davies-Bouldin Index is that a smaller DBI value (non-negative and as close to zero as possible) indicates better clustering. These findings contribute to a better understanding of evaluation techniques that combine the elbow method with the Davies-Bouldin Index, and they offer insight into the relationship between the chosen number of clusters and clustering performance.
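The two computations named above, min-max normalization and the Davies-Bouldin Index, can be illustrated with a minimal pure-Python sketch. It is not the study's implementation: the function names `min_max_normalize` and `davies_bouldin` are illustrative, the four-row `toy_rows` dataset is hypothetical and unrelated to the 174 records, and the cluster labels are assigned by hand rather than produced by K-means.

```python
import math


def min_max_normalize(rows):
    """Rescale each column to [0, 1]; a constant column maps to 0.0."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def centroid(points):
    """Component-wise mean of a list of points."""
    return [sum(c) / len(points) for c in zip(*points)]


def davies_bouldin(rows, labels):
    """Davies-Bouldin Index using Euclidean distance.

    Assumes at least two clusters with distinct centroids.
    """
    clusters = {}
    for row, label in zip(rows, labels):
        clusters.setdefault(label, []).append(row)
    centers = {k: centroid(pts) for k, pts in clusters.items()}
    # Scatter: mean distance from each member to its cluster centroid.
    scatter = {
        k: sum(math.dist(p, centers[k]) for p in pts) / len(pts)
        for k, pts in clusters.items()
    }
    # For each cluster, keep the worst (largest) similarity ratio.
    worst = [
        max((scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
            for j in clusters if j != i)
        for i in clusters
    ]
    return sum(worst) / len(worst)


# Hypothetical toy data: the second attribute is on a much larger scale.
toy_rows = [[1, 100], [2, 110], [9, 900], [10, 1000]]
scaled = min_max_normalize(toy_rows)

dbi_compact = davies_bouldin(scaled, [0, 0, 1, 1])  # nearby points grouped
dbi_mixed = davies_bouldin(scaled, [0, 1, 0, 1])    # distant points grouped
print(dbi_compact, dbi_mixed)
```

On this toy data the labeling that groups nearby points yields the lower index, which matches the principle that a smaller DBI indicates better clustering. Without normalization, the Euclidean distances would be driven almost entirely by the second attribute.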
Copyrights © 2023