Sinkron : Jurnal dan Penelitian Teknik Informatika
Vol. 9 No. 1 (2025): Research Article, January 2025

A Comparative Analysis of Clustering Algorithms for Expedia’s Travel Dataset

Airlangga, Gregorius (Unknown)



Article Info

Publish Date
09 Feb 2025

Abstract

Effective segmentation of travel data is crucial for deriving actionable insights in the tourism and hospitality sectors. This study evaluates four clustering algorithms, namely Agglomerative Clustering, DBSCAN, Gaussian Mixture Models (GMM), and KMeans, on a travel dataset using three widely recognized metrics: Silhouette Score, Davies-Bouldin Index, and Calinski-Harabasz Score. The dataset was preprocessed by standardization followed by dimensionality reduction with Principal Component Analysis (PCA), both to facilitate visualization and to keep computation efficient. The results show significant differences in performance among the algorithms. Agglomerative Clustering achieved the highest Silhouette Score, indicating superior cluster cohesion and separation, while KMeans recorded the highest Calinski-Harabasz Score, indicating a high ratio of between-cluster to within-cluster dispersion. In contrast, DBSCAN performed poorly, producing low scores on all three metrics, which the study attributes to its sensitivity to parameter selection and to density irregularities in the dataset. Gaussian Mixture Models showed moderate performance but struggled with overlapping clusters because of their limitations in modeling non-Gaussian data distributions. Visualization of the clustering results confirmed these findings, revealing compact clusters for Agglomerative Clustering and KMeans, while DBSCAN and GMM produced less well-defined structures. This study underscores the importance of selecting clustering algorithms based on dataset characteristics and analysis objectives.
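The evaluation pipeline described in the abstract (standardization, PCA, four clustering algorithms, three metrics) can be sketched roughly as follows with scikit-learn. This is a minimal illustration, not the author's implementation: the abstract gives neither the dataset features nor any hyperparameters, so the synthetic `make_blobs` data, the choice of 3 clusters, 2 principal components, and the DBSCAN settings `eps=0.5` and `min_samples=5` are all assumptions made here for demonstration. The helper names `evaluate` and `compare` are likewise hypothetical.

```python
from sklearn.cluster import AgglomerativeClustering, DBSCAN, KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler


def evaluate(X, labels):
    """Return the three scores, or None if fewer than two clusters remain."""
    mask = labels != -1  # DBSCAN marks noise points with -1; exclude them
    if len(set(labels[mask])) < 2:
        return None  # the metrics are undefined for a single cluster
    return {
        "silhouette": silhouette_score(X[mask], labels[mask]),
        "davies_bouldin": davies_bouldin_score(X[mask], labels[mask]),
        "calinski_harabasz": calinski_harabasz_score(X[mask], labels[mask]),
    }


def compare(X_raw, n_clusters=3):
    """Standardize, reduce to 2 principal components, then score each model."""
    X = StandardScaler().fit_transform(X_raw)
    X = PCA(n_components=2, random_state=0).fit_transform(X)
    models = {
        "Agglomerative": AgglomerativeClustering(n_clusters=n_clusters),
        "DBSCAN": DBSCAN(eps=0.5, min_samples=5),  # assumed parameters
        "GMM": GaussianMixture(n_components=n_clusters, random_state=0),
        "KMeans": KMeans(n_clusters=n_clusters, n_init=10, random_state=0),
    }
    return {name: evaluate(X, m.fit_predict(X)) for name, m in models.items()}


# Synthetic stand-in for the travel dataset (assumption: not the real data).
X_demo, _ = make_blobs(n_samples=300, centers=3, n_features=6, random_state=0)
results = compare(X_demo)
for name, scores in results.items():
    print(name, scores)
```

When reading the output, higher Silhouette and Calinski-Harabasz values and lower Davies-Bouldin values indicate better-separated clusters. Excluding DBSCAN noise points before scoring is one possible design choice; treating noise as its own cluster would give different numbers.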

Copyrights © 2025






Journal Info

Abbrev

sinkron

Subject

Computer Science & IT

Description

Scope of SinkrOn's Scientific Discussion: 1. Machine Learning 2. Cryptography 3. Steganography 4. Digital Image Processing 5. Networking 6. Security 7. Algorithm and Programming 8. Computer Vision 9. Troubleshooting 10. Internet and E-Commerce 11. Artificial Intelligence 12. Data Mining 13. Artificial ...