Variance : Journal of Statistics and Its Applications
Vol 7 No 1 (2025): VARIANCE: Journal of Statistics and Its Applications

CLUSTERING AND VISUALIZATION OF OLYMPIC ATHLETE DATA BASED ON PHYSICAL AND DISCIPLINARY ATTRIBUTES

Nisa, Hilwin
Chairunissa, Abela



Article Info

Publish Date
31 Jul 2025

Abstract

This study aims to identify hidden patterns in international athlete data through clustering and data visualization. Athletes are grouped by physical characteristics and sports discipline to uncover meaningful trends. Using a dataset of over 200,000 entries spanning 1896 to 2016, the study applies K-Means, Agglomerative, and DBSCAN clustering. Preprocessing includes handling missing data, selecting relevant variables (Height, Weight, Age, Sex, Sport, and Medal), and normalizing the data. The Silhouette scores for K-Means (0.274), Agglomerative (0.261), and DBSCAN (-0.239) indicate suboptimal clustering with overlapping clusters; K-Means performs slightly better than the other two methods. The findings are visualized through cluster plots and an interactive map showing medal distribution. The study highlights the limitations of traditional clustering methods for large datasets and suggests exploring more advanced techniques in future work.
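
As a reading aid for the Silhouette scores quoted above, the following is a minimal sketch of how a mean silhouette coefficient can be computed. It is not the authors' code: the function name silhouette_score, the use of Euclidean distance, and the toy points are all assumptions made for illustration. For each point, a is the mean distance to the other members of its own cluster, b is the smallest mean distance to the members of any other cluster, and the coefficient is (b - a) / max(a, b). Values near 1 suggest compact, well-separated clusters, values near 0 suggest overlap, and negative values suggest points that sit closer to another cluster than to their own.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance).

    Illustrative helper, not taken from the paper.
    Points in singleton clusters are given a coefficient of 0.
    """
    # Group point indices by cluster label.
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            scores.append(0.0)
            continue
        # a: mean distance to the other points in the same cluster.
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the points of any other cluster.
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical, already-normalized (height, weight) pairs; not from the dataset.
toy_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good_labels = [0, 0, 1, 1]  # each cluster is a tight pair
bad_labels = [0, 1, 0, 1]   # each cluster spans both groups

print(silhouette_score(toy_points, good_labels))  # close to 1
print(silhouette_score(toy_points, bad_labels))   # negative
```

On this toy data the tight grouping scores close to 1 and the interleaved grouping scores below 0, which is the sense in which the reported values of roughly 0.27, 0.26, and -0.24 point to overlapping clusters.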

Copyrights © 2025






Journal Info

Abbrev

variance

Subject

Mathematics

Description

This journal is published by the Statistics Study Program, Faculty of Mathematics and Natural Sciences, Universitas Pattimura, Ambon. It is published twice, in June and ...