Jurnal Statistika Universitas Muhammadiyah Semarang
Vol 11, No 2 (2023)

K-Means Algorithm for Grouping Provinces in Indonesia Based on Macroeconomic and Criminality Indicators

Andrea Tri Rian Dani (Statistics Study Program, Department of Mathematics, Faculty of Mathematics and Natural Sciences (FMIPA), Mulawarman University, Samarinda)
Fachrian Bimantoro Putra (Statistics Study Program, Department of Mathematics, Faculty of Mathematics and Natural Sciences (FMIPA), Mulawarman University, Samarinda)
Meirinda Fauziyah (Statistics Study Program, Department of Mathematics, Faculty of Mathematics and Natural Sciences (FMIPA), Mulawarman University, Samarinda)
Sifriyani Sifriyani (Statistics Study Program, Department of Mathematics, Faculty of Mathematics and Natural Sciences (FMIPA), Mulawarman University, Samarinda)
Suyitno Suyitno (Statistics Study Program, Department of Mathematics, Faculty of Mathematics and Natural Sciences (FMIPA), Mulawarman University, Samarinda)
M Fathurahman (Statistics Study Program, Department of Mathematics, Faculty of Mathematics and Natural Sciences (FMIPA), Mulawarman University, Samarinda)



Article Info

Publish Date
30 Nov 2023

Abstract

Cluster analysis is a multivariate method that groups n observations into K groups (K ≤ n) based on their characteristics. One of the best-known clustering algorithms is K-Means. K-Means is a non-hierarchical method, so the number of groups must be specified in advance, at initialization. In this study, the K-Means algorithm is applied to group provinces in Indonesia based on macroeconomic indicators (percentage of poor people, open unemployment rate, and Gini ratio) and a criminality indicator (crime rate). The goal of the research is to obtain an optimal grouping. The distance measure used is the Euclidean distance. The numbers of groups tested were K = 2, 3, 4, …, 10, and the K with the highest Silhouette value was selected as optimal. Based on the results of the analysis, the optimal number of clusters is four. These four clusters have characteristics that distinguish them from one another.
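The procedure described in the abstract (K-Means with Euclidean distance, K tested from 2 to 10, and the K with the highest Silhouette value selected) can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the authors' implementation. The function names (`euclidean`, `standardize`, `kmeans`, `silhouette`, `best_k`), the z-score standardization step, the random restarts, and the synthetic data are all assumptions introduced here; the abstract does not say whether the indicators were scaled or which software was used.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def standardize(rows):
    """Z-score each column (an assumption; the abstract does not mention scaling)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
           for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, sds)] for r in rows]


def kmeans(points, k, n_init=20, max_iter=100, seed=0):
    """Lloyd's K-Means with random restarts; returns the labels of the lowest-SSE run."""
    rng = random.Random(seed)
    best_labels, best_sse = None, math.inf
    for _ in range(n_init):
        centers = rng.sample(points, k)  # initial centers are k distinct observations
        for _ in range(max_iter):
            # Assignment step: each point goes to its nearest center.
            labels = [min(range(k), key=lambda j: euclidean(p, centers[j]))
                      for p in points]
            # Update step: each center becomes the mean of its members.
            new_centers = []
            for j in range(k):
                members = [p for p, lab in zip(points, labels) if lab == j]
                new_centers.append([sum(c) / len(members) for c in zip(*members)]
                                   if members else centers[j])
            if new_centers == centers:  # no center moved: converged
                break
            centers = new_centers
        sse = sum(euclidean(p, centers[lab]) ** 2 for p, lab in zip(points, labels))
        if sse < best_sse:
            best_labels, best_sse = labels, sse
    return best_labels


def silhouette(points, labels):
    """Mean silhouette width; points in singleton clusters contribute 0."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    if len(clusters) < 2:
        return 0.0
    total = 0.0
    for i, p in enumerate(points):
        own = clusters[labels[i]]
        if len(own) == 1:
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(euclidean(p, points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(sum(euclidean(p, points[j]) for j in idx) / len(idx)
                for lab, idx in clusters.items() if lab != labels[i])
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def best_k(points, k_values=range(2, 11)):
    """Return the K with the highest mean silhouette, plus all scores."""
    scores = {k: silhouette(points, kmeans(points, k)) for k in k_values}
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    # Synthetic stand-in data: 30 made-up rows with 4 indicators each, drawn
    # around three invented centers. These are NOT the paper's provincial data.
    rng = random.Random(42)
    invented_centers = [(5, 4, 0.30, 80), (12, 6, 0.35, 150), (25, 3, 0.40, 60)]
    data = [[c + rng.uniform(-0.05, 0.05) * c for c in ctr]
            for ctr in invented_centers for _ in range(10)]
    k, scores = best_k(standardize(data))
    print("selected K:", k)
    for kk, s in scores.items():
        print(kk, round(s, 3))
```

To try this on real data, one would replace `data` with one row per province holding the four indicators named in the abstract. Standardizing first keeps a large-valued indicator such as the crime rate from dominating the Euclidean distance over a small-valued one such as the Gini ratio, though whether the authors did so is not stated.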

Copyrights © 2023






Journal Info

Abbrev

statistik

Publisher

Subject

Decision Sciences, Operations Research & Management

Description

Focus and Scope: a. Theoretical Statistics, Computational Statistics, Applied Statistics; b. Theoretical and Applied Mathematics; c. Design of ...