CLUSTER ANALYSIS OF MULTIVARIATE PANEL DATA ON DATA CONTAINING OUTLIERS
Kapiluka, Kristuisno Martsuyanto; Wijayanto, Hari; Fitrianto, Anwar
BAREKENG: Jurnal Ilmu Matematika dan Terapan Vol 20 No 1 (2026): BAREKENG: Journal of Mathematics and Its Application
Publisher : PATTIMURA UNIVERSITY

DOI: 10.30598/barekengvol20iss1pp0439-0452

Abstract

K-Means Longitudinal (KML) is a clustering method for panel data that considers only a single trajectory per subject over time. To address this limitation, KML was extended to K-Means Longitudinal 3D (KML3D), which clusters joint (multivariate) longitudinal data by considering multiple trajectories measured simultaneously for each subject. Both KML and KML3D cluster panel data with a non-hierarchical K-Means approach. Because standard KML3D applies the K-Means algorithm to trajectories and uses the mean as the cluster centroid, it is referred to hereinafter as KML3D K-Means. In practice, the K-Means algorithm is less effective on data with outliers, since the mean is sensitive to extreme values. This issue can be addressed by KML3D K-Medoids, a KML3D-based method that uses the median as the centroid. This study compares KML3D K-Means and KML3D K-Medoids for clustering multivariate panel data containing outliers. Both methods are applied to panel data of social and welfare statistics from 34 provinces observed over 8 years (2016–2023), and they are compared using the Calinski–Harabasz (CH) index. The results show that KML3D K-Medoids attains a higher CH index than KML3D K-Means on multivariate panel data with outliers. Based on the CH index, the optimal number of clusters is three (k = 3), which can be categorized as the “more prosperous”, “moderately prosperous”, and “less prosperous” groups. The growth rate analysis reveals disparities in development trajectories across clusters: cluster 3 shows the most consistent improvements, cluster 1 shows moderate progress, and cluster 2 lags in key social and welfare indicators.
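To make the mean-versus-median distinction concrete, the following is a minimal Python sketch, not the authors' implementation and not the API of the kml3d software. It assumes each subject's joint trajectories (variables × time points) are flattened into one vector, runs a Lloyd-style loop with a pluggable center function, and scores a partition with the standard Calinski–Harabasz index, CH = [B / (k − 1)] / [W / (n − k)], where B and W are the between-cluster and within-cluster sums of squares. The toy data, the function names (`flatten`, `cluster`, `calinski_harabasz`), and the fixed starting subjects are all hypothetical choices made for illustration.

```python
from statistics import mean, median

def flatten(subject):
    """Concatenate one subject's trajectories (one list per variable) into a vector."""
    return [v for series in subject for v in series]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def cluster(points, init_idx, center_fn, n_iter=50):
    """Lloyd-style clustering; center_fn=mean gives K-Means, center_fn=median
    gives the median-centered variant. init_idx picks the starting subjects
    (an assumption made here to keep the sketch deterministic)."""
    k = len(init_idx)
    centers = [list(points[i]) for i in init_idx]
    labels = None
    for _ in range(n_iter):
        # Assign every subject to its nearest center.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the centers are too
        labels = new_labels
        # Recompute each center coordinate-wise with the chosen statistic.
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centers[j] = [center_fn(col) for col in zip(*members)]
    return labels, centers

def calinski_harabasz(points, labels):
    """Standard CH index, computed from cluster means and the overall mean."""
    n, groups = len(points), sorted(set(labels))
    k = len(groups)
    overall = [mean(col) for col in zip(*points)]
    between = within = 0.0
    for j in groups:
        members = [p for p, l in zip(points, labels) if l == j]
        c = [mean(col) for col in zip(*members)]
        between += len(members) * sq_dist(c, overall)
        within += sum(sq_dist(p, c) for p in members)
    return (between / (k - 1)) / (within / (n - k))

# Hypothetical toy panel: 7 subjects, 2 variables, 3 time points each.
subjects = [
    [[1, 2, 3], [10, 11, 12]],
    [[1, 2, 4], [10, 12, 12]],
    [[2, 2, 3], [11, 11, 13]],
    [[8, 9, 10], [20, 21, 22]],
    [[9, 9, 10], [20, 22, 22]],
    [[8, 10, 10], [21, 21, 23]],
    [[9, 9, 60], [20, 21, 22]],  # one outlying measurement (60)
]
points = [flatten(s) for s in subjects]

labels_mean, centers_mean = cluster(points, init_idx=(0, 3), center_fn=mean)
labels_med, centers_med = cluster(points, init_idx=(0, 3), center_fn=median)
ch_mean = calinski_harabasz(points, labels_mean)
ch_med = calinski_harabasz(points, labels_med)
```

On this toy data both variants produce the same partition, but the outlier pulls the third coordinate of the second cluster's mean center to 22.5, while the median center stays at 10. Because the partitions coincide, the two CH values are equal here; the sketch shows only the mechanism and does not reproduce the study's finding on the provincial data.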