Journal of Applied Data Sciences
Vol 7, No 2: May 2026

Utilization of K-means Clustering for Classifying Diabetes Risk Populations According to Health Behaviors and 3Es-2Ss Health Literacy

Supaporn Yodmunee (Unknown)
Wongpanya S. Nuankaew (Department of Computer Science, School of Information and Communication Technology, University of Phayao, Phayao, 56000 Thailand)
Thapanapong Sararat (Department of Computer Graphics and Multimedia, School of Information and Communication Technology, University of Phayao, Phayao, 56000 Thailand)
Pratya Nuankew (Department of Digital Business, School of Information and Communication Technology, University of Phayao, Phayao, 56000 Thailand)

Article Info

Publish Date
05 Apr 2026

Abstract

This study classified populations at risk for diabetes using K-means clustering combined with the 3Es–2Ss health literacy framework: eating, exercise, emotion, smoking cessation, and alcohol cessation. Biological, behavioral, and health literacy data were analyzed. The dataset was collected from 126 participants identified as at-risk individuals in Ngao District, Lampang Province, Thailand. This relatively small, community-based sample provides valuable insight into local health behaviors, but it limits the statistical power of the findings and their generalizability to broader populations. Missing data and inconsistencies were addressed systematically through data cleaning and normalization to maintain analytical reliability. K-means clustering, guided by the Elbow method, identified k = 4 as the optimal number of clusters, yielding four distinct groups with different socio-demographic and health characteristics. These clusters revealed variations in health profiles, economic status, and behavioral literacy within the Thai population. The results suggest that K-means clustering can serve as an effective decision-support tool for public health planning, particularly for Non-Communicable Disease (NCD) prevention and diabetes management at the local level.
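
The workflow named in the abstract (normalization, K-means, and the Elbow method) can be illustrated with a minimal Python sketch. This is not the authors' code. The data are synthetic, the three numeric feature columns are hypothetical stand-ins, min-max scaling is an assumed choice (the abstract says only "normalization"), and the helper names `min_max_normalize`, `kmeans`, and `make_synthetic` are invented for this example.

```python
import random


def min_max_normalize(rows):
    """Scale each column to [0, 1] so no single feature dominates distances."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's algorithm; returns (centroids, labels, inertia)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the run has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    # Inertia: within-cluster sum of squared distances.
    inertia = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, inertia


def make_synthetic(n=126, seed=42):
    """Synthetic stand-in data: n rows, 3 hypothetical numeric features."""
    rng = random.Random(seed)
    centers = [(90, 22, 80), (110, 27, 60), (125, 31, 40), (100, 24, 30)]
    spreads = (5, 1.5, 6)
    return [
        [rng.gauss(m, s) for m, s in zip(centers[i % 4], spreads)]
        for i in range(n)
    ]


data = min_max_normalize(make_synthetic())

# Elbow method: record the best inertia over a few restarts for each k.
inertias = {
    k: min(kmeans(data, k, seed=s)[2] for s in range(5))
    for k in range(1, 9)
}
for k, value in inertias.items():
    print(k, round(value, 3))
```

Plotting these inertia values against k and looking for the point where the curve flattens is the usual way to read an elbow. The study reports k = 4 for its own data, and the synthetic data above will not necessarily reproduce that choice.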

Copyrights © 2026

Journal Info

Abbrev

JADS

Subject

Computer Science & IT; Control & Systems Engineering; Decision Sciences, Operations Research & Management

Description

One of the current hot topics in science is data: how can datasets be used in scientific and scholarly research in a more reliable, citable and accountable way? Data is of paramount importance to scientific progress, yet most research data remains private. Enhancing the transparency of the processes ...