ILKOMNIKA: Journal of Computer Science and Applied Informatics
Vol 8 No 1 (2026): Volume 8, Number 1, April 2026

Lifestyle-Based Obesity Risk Clustering Using Ward Hierarchical Clustering

Aunilla, Moch. Fikri (Unknown)
Mufliq, Achmad (Unknown)
Nugroho, Rizky Aditya (Unknown)



Article Info

Publish Date
30 Apr 2026

Abstract

Obesity is a growing public health problem influenced by multiple interacting lifestyle behaviors that cannot always be adequately captured by single-factor or label-driven analysis. Therefore, this study applies an unsupervised clustering approach to identify natural behavioral segments associated with obesity risk. The data were obtained from the Poltekkes Kemenkes Semarang Obesity Risk Dataset, consisting of 20,758 records and 16 mixed attributes (numerical and categorical) with the NObeyesdad label. Pre-processing included standardizing numerical features and one-hot encoding categorical features, followed by dimensionality reduction using PCA to 13 components retaining approximately 95.48% of the variance. Ward clustering was applied in the PCA space, and the number of clusters was tested for k=2–10 using the Silhouette coefficient, Davies–Bouldin Index (DBI), and Calinski–Harabasz (CH) index. Although the average Silhouette coefficient was modest (≈0.2029), the k=5 solution was retained because it offered the best balance between internal validation results and the practical interpretability of cluster-based risk profiles. BMI-based interpretation using WHO Asia criteria identified Cluster 0 as very high risk (mean BMI 33.04; 73.7% obese), Cluster 2 as high risk characterized by predominant smoking (65.3% obese), Cluster 4 as moderate-to-high risk (34.0% overweight; 30.3% obese), Cluster 1 as a mixed group, and Cluster 3 as relatively low risk (mean BMI 22.21; 8.9% obese). Agreement between clusters and the label was low (NMI 0.126; ARI 0.073), indicating that the clusters represent similarity in behavioral patterns rather than the label classes.
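The cluster-versus-label agreement reported at the end of the abstract can be illustrated with a small worked example. The sketch below computes the Adjusted Rand Index (ARI) from the standard pair-counting formula using only the Python standard library. It is not the authors' code: the function name `adjusted_rand_index` and the toy label lists are hypothetical, chosen for illustration, and the study's own values (ARI 0.073) come from its full dataset, which is not reproduced here.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index between two labelings of the same records.

    Hypothetical helper for illustration; follows the standard
    pair-counting definition of the ARI.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same records")

    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two records: trivially identical

    # Contingency counts: records sharing cluster i and class j.
    pair_counts = Counter(zip(labels_a, labels_b))
    row_sums = Counter(labels_a)
    col_sums = Counter(labels_b)

    # Number of record pairs grouped together in both, in A, and in B.
    index = sum(comb(c, 2) for c in pair_counts.values())
    sum_rows = sum(comb(c, 2) for c in row_sums.values())
    sum_cols = sum(comb(c, 2) for c in col_sums.values())

    expected = sum_rows * sum_cols / total_pairs  # agreement expected by chance
    maximum = (sum_rows + sum_cols) / 2           # upper bound on the index
    if maximum == expected:
        return 1.0  # degenerate case, e.g. a single group in both labelings
    return (index - expected) / (maximum - expected)


# Toy data (assumed, not from the study): cluster ids vs. class labels.
clusters = [0, 0, 1, 1, 2, 2]
classes = ["a", "a", "b", "b", "c", "c"]
perfect = adjusted_rand_index(clusters, classes)              # identical partitions
crossed = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])     # partitions that cut across each other
```

On the toy data, identical partitions give an ARI of 1.0 regardless of how the groups are named, while the crossed partitions give a negative value (about -0.5), meaning agreement below chance. A value close to zero, such as the 0.073 reported in the abstract, therefore indicates that the clusters and the NObeyesdad classes group the records largely independently of each other.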

Copyrights © 2026






Journal Info

Abbrev

ilkomnika

Subject

Computer Science & IT; Control & Systems Engineering; Decision Sciences, Operations Research & Management

Description

ILKOMNIKA: Journal of Computer Science and Applied Informatics is a peer-reviewed, open-access journal. The journal invites scientists and engineers throughout the world to exchange and disseminate theoretical and practice-oriented topics of computer science and applied informatics which covers five (5) ...