Journal of Big Data Analytic and Artificial Intelligence
Vol 8 No 1 (2025): JBIDAI Juni 2025

Optimalisasi Pengelompokkan Konsumen dengan Multi Internal Metric Validation dan Boxplot Analysis (Optimizing Consumer Clustering with Multi Internal Metric Validation and Boxplot Analysis)

Fitriyanto, Rachmad
Nurindah, Nurindah



Article Info

Publish Date
30 Jun 2025

Abstract

When several internal validation metrics are used together to determine the optimal number of clusters in K-Means clustering, they often recommend different K values, which can confuse data practitioners who are trying to extract insights such as customer characteristics. This study develops an evaluation framework to resolve the ambiguity caused by these conflicting K values. The proposed framework has two stages. In the first stage, five internal validation metrics, namely the Davies-Bouldin Index (DBI), Silhouette Score, Elbow Method, Dunn Index, and Calinski-Harabasz Index, are used as filters to generate up to five top K candidates. The second stage uses boxplot analysis, the interquartile range (IQR), and elbow visualization to examine the cohesiveness and stability of the resulting clusters. The first-stage evaluation yielded four candidate cluster counts: K = 2, 5, 7, and 10. In the second stage, the elbow graph of the average interquartile range identified K = 5 as the optimal number of clusters among these candidates. These results indicate that using more internal validation metrics may increase the likelihood of obtaining multiple K values, and that a higher number of clusters does not necessarily guarantee better cluster quality. The findings highlight the importance of a layered evaluation approach when determining the optimal number of clusters, especially when multiple internal validation metrics are employed. The proposed framework can help data practitioners make more informed decisions and reduce ambiguity in the clustering process. In the future, the framework can be extended with external validation metrics or adapted to other clustering algorithms.
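The two-stage idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`candidate_ks`, `iqr`, `mean_cluster_iqr`) and the input layout are hypothetical, and the sketch assumes that each metric nominates only its single best K. The Elbow Method is left out of stage one because an elbow is read from the shape of a curve rather than from a maximum or minimum. Stage two here only computes the average per-cluster IQR of one feature; choosing K from the elbow of that curve across candidates is left to the reader.

```python
from statistics import mean, quantiles

# Direction in which each metric improves (standard definitions).
# The Elbow Method is omitted: it is judged from a curve, not an arg-max.
HIGHER_IS_BETTER = {
    "silhouette": True,
    "calinski_harabasz": True,
    "dunn": True,
    "davies_bouldin": False,  # lower DBI means better-separated clusters
}


def candidate_ks(scores):
    """Stage 1 (assumed form): collect the K favoured by each metric.

    `scores` maps a metric name to a dict {K: score}.
    Returns the sorted, de-duplicated list of nominated K values.
    """
    picks = set()
    for metric, by_k in scores.items():
        choose = max if HIGHER_IS_BETTER[metric] else min
        picks.add(choose(by_k, key=by_k.get))
    return sorted(picks)


def iqr(values):
    """Interquartile range (Q3 - Q1); needs at least two values."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


def mean_cluster_iqr(values, labels):
    """Stage 2 (assumed form): average IQR of one feature over all clusters.

    `values[i]` is the feature value of point i and `labels[i]` its cluster.
    A smaller result suggests tighter, more cohesive clusters.
    """
    clusters = {}
    for value, label in zip(values, labels):
        clusters.setdefault(label, []).append(value)
    return mean(iqr(vs) for vs in clusters.values())


if __name__ == "__main__":
    # Made-up scores for illustration only; not results from the study.
    toy_scores = {
        "silhouette": {2: 0.6, 5: 0.5, 7: 0.4},
        "davies_bouldin": {2: 0.9, 5: 0.7, 7: 0.8},
        "calinski_harabasz": {2: 100.0, 5: 120.0, 7: 150.0},
        "dunn": {2: 0.3, 5: 0.2, 7: 0.1},
    }
    print("candidate K values:", candidate_ks(toy_scores))
```

In practice, `mean_cluster_iqr` would be evaluated once for every candidate K from stage one, using the labels produced by K-Means at that K, and the resulting values plotted against K to look for an elbow.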

Copyrights © 2025






Journal Info

Abbrev

JBIDAI

Subject

Computer Science & IT

Description

JBIDAI is an online, Indonesian-language national journal managed by the Information Systems Study Program of STMIK PPKIA Tarakanita Rahmawati. The journal publishes research results with a focus that covers: Artificial Intelligence, Big Data, Data Mining, Information Retrieval, ...