This study evaluates K-Means clustering for segmenting mall customers by demographic and behavioral features, using the Mall Customers dataset. The segmentation uses three numerical attributes (age, annual income, and spending score) plus one engineered feature, the spending-to-income ratio. After min-max normalization and log transformation, the Elbow Method was used to select the number of clusters ($K=5$). The resulting clusters were evaluated with internal validation metrics: the Silhouette Score, Davies-Bouldin Index, and Calinski-Harabasz Index. K-Means achieved the best overall performance when compared with Gaussian Mixture Models (GMM), DBSCAN, and Agglomerative Hierarchical Clustering. Five interpretable customer profiles emerged, ranging from high-spending young professionals to low-engagement senior customers. The clusters were visualized with PCA for dimensionality reduction and interpreted through descriptive statistics and domain-based labeling. Each cluster was then aligned with strategic marketing recommendations to derive business implications. Overall, the findings reaffirm that classical clustering methods such as K-Means, when rigorously validated and thoughtfully interpreted, can yield actionable insights in customer analytics.
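The pipeline summarized above can be sketched in a few lines of scikit-learn. This is only an illustrative sketch, not the study's actual code: the data are synthetic stand-ins for the Mall Customers dataset, the variable names are hypothetical, and the preprocessing order (a `log1p` transform on all four features followed by min-max scaling) is an assumption, since the abstract does not specify it.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the Mall Customers data (hypothetical values,
# NOT the real dataset): age, annual income, spending score.
rng = np.random.default_rng(0)
n = 200
age = rng.integers(18, 71, size=n).astype(float)
income = rng.uniform(15, 140, size=n)    # assumed unit: thousands of dollars
spending = rng.uniform(1, 100, size=n)   # assumed range: 1-100

# Engineered feature: spending-to-income ratio.
ratio = spending / income

# Assumed preprocessing order: log1p on every feature, then min-max scaling.
features = np.column_stack([age, income, spending, ratio])
X = MinMaxScaler().fit_transform(np.log1p(features))

# Elbow Method: record the within-cluster sum of squares (inertia) per k;
# the "elbow" is where further increases in k stop reducing it sharply.
inertias = {
    k: KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
    for k in range(2, 11)
}

# Fit the final model with K=5, the value reported in the abstract.
labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(X)

# Internal validation metrics named in the abstract.
scores = {
    "silhouette": silhouette_score(X, labels),                # higher is better
    "davies_bouldin": davies_bouldin_score(X, labels),        # lower is better
    "calinski_harabasz": calinski_harabasz_score(X, labels),  # higher is better
}

# 2-D PCA projection, e.g. for a scatter plot colored by cluster label.
coords = PCA(n_components=2).fit_transform(X)

for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

On random synthetic data the scores carry no meaning; the sketch only shows how the pieces fit together. Running the same three metrics on the labels produced by `GaussianMixture`, `DBSCAN`, and `AgglomerativeClustering` would give the kind of side-by-side comparison the abstract reports.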
Copyrights © 2025