This study applies an unsupervised learning method, the K-Means clustering algorithm, to group patients based on medical data without requiring prior disease labels. The dataset consists of 300 simulated (synthetic) patient records with the variables blood pressure, blood sugar, and cholesterol, together with the symptoms fever, cough, shortness of breath, and muscle pain. The results show that the model divides the patients into four main clusters, corresponding to hypertension, diabetes, hypercholesterolemia, and respiratory infection, which are consistent with realistic clinical conditions. Analysis of the per-cluster feature averages, scatter plots, and heatmaps supports the interpretation of each group's characteristics. These results indicate that K-Means can serve as an efficient early diagnostic tool even when the data are unlabeled.
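The pipeline summarized above (synthetic records, K-Means with four clusters, per-cluster feature averages) can be sketched roughly as follows. This is a minimal illustration, not the study's actual code: the abstract does not state which library, feature scaling, or data-generation scheme was used, so the group profiles, standard deviations, symptom probabilities, and the function names `make_synthetic_patients`, `assign`, and `kmeans` are all assumptions made for this example. K-Means is written out directly with NumPy (Lloyd's algorithm with random initialization) so that the steps are visible.

```python
import numpy as np

def make_synthetic_patients(n_per_group=75, seed=0):
    """Build 4 x 75 = 300 synthetic records (all profile values are illustrative assumptions)."""
    rng = np.random.default_rng(seed)
    # Assumed group means for: blood pressure, blood sugar, cholesterol
    vital_means = np.array([
        [165.0,  95.0, 180.0],   # hypertension-like
        [120.0, 220.0, 185.0],   # diabetes-like
        [122.0,  98.0, 290.0],   # hypercholesterolemia-like
        [118.0,  96.0, 178.0],   # respiratory-infection-like
    ])
    vital_sd = np.array([10.0, 15.0, 15.0])
    # Assumed probabilities of: fever, cough, shortness of breath, muscle pain
    symptom_p = np.array([
        [0.05, 0.05, 0.05, 0.05],
        [0.05, 0.05, 0.05, 0.05],
        [0.05, 0.05, 0.05, 0.05],
        [0.90, 0.90, 0.80, 0.80],
    ])
    groups = []
    for means, probs in zip(vital_means, symptom_p):
        vitals = rng.normal(means, vital_sd, size=(n_per_group, 3))
        symptoms = (rng.random((n_per_group, 4)) < probs).astype(float)
        groups.append(np.hstack([vitals, symptoms]))
    return np.vstack(groups)

def assign(X, centroids):
    """Return the index of the nearest centroid (squared Euclidean distance) for each row."""
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def kmeans(X, k, n_iter=100, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid update until centroids stop moving."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = assign(X, centroids)
        new_centroids = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                new_centroids[j] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return assign(X, centroids), centroids

X = make_synthetic_patients()

# Standardize so that blood sugar and cholesterol do not dominate the 0/1 symptom flags
std = X.std(axis=0)
std[std == 0] = 1.0
X_scaled = (X - X.mean(axis=0)) / std

labels, centroids = kmeans(X_scaled, k=4)

# Per-cluster feature averages in the original units, used to interpret each cluster
for j in range(4):
    members = X[labels == j]
    if len(members) > 0:
        print(j, len(members), members.mean(axis=0).round(1))
```

The labels come from the algorithm alone; naming a cluster "hypertension" or "diabetes" is an interpretation step based on the printed averages. Random initialization can settle in a local optimum, so in practice several restarts or a k-means++ style initialization would typically be used.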
Copyright © 2026