Jambura Journal of Electrical and Electronics Engineering
Vol 7, No 2 (2025): July - December 2025

Optimization of K-Means Attribute Selection Using Correlation Matrix in Patient Disease Clustering

Bengnga, Amiruddin
Ishak, Rezqiwati



Article Info

Publish Date
13 Jul 2025

Abstract

Patient health is a critical element of public health systems, and grouping disease data can support risk identification and more efficient treatment planning. However, conventional clustering methods such as K-Means often fail to separate clusters well, especially when the attributes used are irrelevant or redundant. This study aims to optimize the clustering of patient health data by applying attribute selection, based on a correlation matrix visualized as a heatmap, before running the K-Means algorithm. The data are standardized with StandardScaler, and the number of clusters is determined with the Elbow Method, which yields three clusters. Attribute selection reduces redundancy and retains important features such as age, height, and body mass index (BMI). The analysis shows that attribute selection substantially improves clustering performance: the Silhouette Score (higher is better) rises from 0.20 to 0.54, and the Davies-Bouldin Index (DBI; lower is better) falls from 1.60 to 0.63. Visualizing the clustering results with Principal Component Analysis (PCA) shows a clearer separation between clusters, reflecting distinct patient characteristics. These findings confirm the importance of attribute selection in clustering for obtaining better results that help in understanding patient health patterns and designing more appropriate interventions.
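The pipeline described in the abstract can be sketched in Python with scikit-learn. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic, the column names (`age`, `height_cm`, `bmi`, `bmi_noisy_copy`), the 0.8 correlation threshold, and the helper names `drop_correlated`, `cluster_and_score`, and `elbow_inertias` are all hypothetical, and the abstract does not state which threshold or dataset was used.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import davies_bouldin_score, silhouette_score
from sklearn.preprocessing import StandardScaler


def drop_correlated(df, threshold=0.8):
    """Drop the later column of each pair whose |Pearson r| exceeds threshold.

    The threshold value is an assumption; the abstract does not give one.
    """
    corr = df.corr().abs()
    # Keep only the upper triangle so each pair is inspected once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return df.drop(columns=to_drop)


def elbow_inertias(df, k_values=range(1, 7), seed=0):
    """Return {k: inertia} on standardized data, for reading off the elbow."""
    X = StandardScaler().fit_transform(df)
    return {
        k: KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X).inertia_
        for k in k_values
    }


def cluster_and_score(df, k=3, seed=0):
    """Standardize, run K-Means, and return data, labels, and both indices."""
    X = StandardScaler().fit_transform(df)
    labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(X)
    return X, labels, silhouette_score(X, labels), davies_bouldin_score(X, labels)


# Synthetic stand-in for patient records (hypothetical, not the study's data).
rng = np.random.default_rng(0)
centers = np.array([[60, 160, 22], [35, 175, 22], [35, 160, 32]], dtype=float)
rows = np.vstack([c + rng.normal(scale=[4, 3, 1.5], size=(50, 3)) for c in centers])
patients = pd.DataFrame(rows, columns=["age", "height_cm", "bmi"])
# A deliberately redundant attribute, so the selection step has work to do.
patients["bmi_noisy_copy"] = patients["bmi"] + rng.normal(scale=0.1, size=len(patients))

selected = drop_correlated(patients, threshold=0.8)
inertias = elbow_inertias(selected)

_, _, sil_all, dbi_all = cluster_and_score(patients, k=3)
X_sel, labels, sil_sel, dbi_sel = cluster_and_score(selected, k=3)

# Two principal components give 2-D coordinates for a scatter plot of clusters.
coords = PCA(n_components=2).fit_transform(X_sel)

print("kept attributes:", list(selected.columns))
print(f"all attributes:      silhouette={sil_all:.2f}, DBI={dbi_all:.2f}")
print(f"selected attributes: silhouette={sil_sel:.2f}, DBI={dbi_sel:.2f}")
```

The scores printed for this synthetic data are not comparable to the values reported in the abstract; the sketch only shows how the two indices would be computed before and after attribute selection.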

Copyrights © 2025






Journal Info

Abbrev

jjeee

Publisher

Subject

Computer Science & IT; Control & Systems Engineering; Electrical & Electronics Engineering; Energy Engineering

Description

Jambura Journal of Electrical and Electronics Engineering (JJEEE) is a peer-reviewed journal published by the Electrical Engineering Department, Faculty of Engineering, State University of Gorontalo. JJEEE provides open access on the principle that research published in this journal is freely available ...