Digital transformation in the health system demands more strategic use of clinical data to support evidence-based decision-making. This study explores the application of the K-Means clustering algorithm to patient segmentation based on a combination of vital signs (systolic and diastolic blood pressure) and demographic characteristics (age, weight, and gender). Data on 1,401 outpatients were obtained from the medical record system of a hospital in Indonesia and then processed through preprocessing, standardization, and dimensionality reduction using PCA. The elbow method indicated that the optimal number of clusters was three (k=3). Descriptive analysis showed that Cluster 0 consisted entirely of women (100%) with normal blood pressure (124/77 mmHg) and an average body weight of 55.6 kg; Cluster 1 consisted mostly of women with high blood pressure (160.8/98.8 mmHg); and Cluster 2 consisted entirely of men (100%) with blood pressure in the pre-hypertensive range (130.1/80.7 mmHg). PCA visualizations showed fairly clear cluster separation, with Cluster 1 having the most clinically distinct characteristics. The study concludes that the K-Means-based unsupervised learning approach is effective in identifying latent risk patterns in patient populations and has the potential to support clinical risk mapping and preventive health policies. Recommendations for future work include integrating this method into EMR systems and extending the study to national datasets.
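The standardization, elbow-method, and K-Means steps described above can be illustrated with a minimal sketch. It is not the study's implementation: it uses only the Python standard library, omits the PCA step to stay dependency-free, and runs on synthetic data. The blood pressure means and the 55.6 kg weight are loosely borrowed from the reported cluster averages, while the ages, the other weights, the spreads, the 70% female share, the 0/1 gender encoding, and all function names (`standardize`, `kmeans`, `best_kmeans`, `make_group`) are assumptions made for illustration.

```python
import random
from statistics import mean, pstdev


def standardize(rows):
    """Z-score each column so every feature has mean 0 and unit variance."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(v - m) / s for v, m, s in zip(r, mus, sds)] for r in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centers, inertia)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # random initial centers
    labels = None
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = [mean(c) for c in zip(*members)]
    inertia = sum(sq_dist(p, centers[l]) for p, l in zip(points, labels))
    return labels, centers, inertia


def best_kmeans(points, k, restarts=5):
    """Keep the lowest-inertia run over several random initializations."""
    return min((kmeans(points, k, seed=s) for s in range(restarts)),
               key=lambda run: run[2])


def make_group(rng, n, age, weight, sbp, dbp, female_share):
    """Synthetic patients: [age, weight, systolic, diastolic, gender]."""
    return [[rng.gauss(age, 8), rng.gauss(weight, 6),
             rng.gauss(sbp, 8), rng.gauss(dbp, 5),
             1.0 if rng.random() < female_share else 0.0]  # assumed encoding
            for _ in range(n)]


# Synthetic stand-in for the patient table (NOT the study's data).
rng = random.Random(42)
data = (make_group(rng, 100, 40, 55.6, 124.0, 77.0, 1.0)
        + make_group(rng, 100, 55, 62.0, 160.8, 98.8, 0.7)
        + make_group(rng, 100, 42, 68.0, 130.1, 80.7, 0.0))

X = standardize(data)

# Elbow method: inspect how inertia falls as k grows and look for the bend.
inertias = {k: best_kmeans(X, k)[2] for k in range(1, 7)}
for k, w in inertias.items():
    print(f"k={k}: inertia={w:.1f}")

# Final segmentation with the k chosen from the elbow plot.
labels, centers, _ = best_kmeans(X, 3)
```

Standardizing first matters here because blood pressure, weight, and a binary gender flag live on very different scales, and K-Means relies on Euclidean distance. In practice a library such as scikit-learn would typically replace the hand-written routines, and a PCA projection could be added afterwards for visualization.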
Copyright © 2025