Purpose – This study develops a data-driven obesity classification framework that integrates genetic predisposition and lifestyle determinants using the Naive Bayes algorithm, and empirically evaluates which training–testing data proportions work best for health decision support systems.

Methods – A systematic computational workflow was applied to a public obesity dataset comprising 2,112 records, which was reduced to 1,259 valid instances after preprocessing. Genetic indicators and lifestyle-related variables were encoded, and each record was classified into one of four categories: normal weight, obesity type I, obesity type II, and obesity type III. The Naive Bayes model was evaluated using three training–testing data partition ratios (75:25, 80:20, and 85:15). Model performance was assessed using six metrics: Area Under the Curve (AUC), classification accuracy, F1-score, precision, recall, and Matthews Correlation Coefficient (MCC).

Findings – The 80:20 and 85:15 data partitions achieved the highest performance, with an accuracy of 0.878 and an AUC of 0.979. The model showed high sensitivity in identifying severe obesity cases, while moderate misclassification occurred between obesity type I and type II, which the study attributes to phenotypic overlap in lifestyle patterns.

Research limitations – This study relies on a single public dataset and lacks population-specific genetic calibration, which may limit generalizability to diverse regional contexts.

Originality – This study provides empirical validation of a probabilistic obesity classification framework that integrates genetic and lifestyle factors, offering an interpretable and computationally efficient approach to support data-driven health decision making.
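
The evaluation protocol described under Methods can be illustrated with a minimal sketch. It is not the study's implementation: the data are synthetic, the three feature names are hypothetical, the categorical Naive Bayes variant with Laplace smoothing is an assumption (the abstract does not state which variant was used), and only two of the six metrics (accuracy and multiclass MCC) are computed.

```python
import math
import random
from collections import Counter, defaultdict

CLASSES = ["normal", "obesity_I", "obesity_II", "obesity_III"]

def make_synthetic(n=400, seed=0):
    """Hypothetical toy data: (family_history, activity, snacking) -> class index."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = rng.randrange(len(CLASSES))
        # Feature values drift with the class index plus noise, so classes overlap.
        family_history = int(rng.random() < 0.2 + 0.2 * label)
        activity = max(0, min(3, 3 - label + rng.choice([-1, 0, 0, 1])))
        snacking = max(0, min(3, label + rng.choice([-1, 0, 0, 1])))
        rows.append(((family_history, activity, snacking), label))
    return rows

def train_nb(rows, alpha=1.0):
    """Count-based categorical Naive Bayes; alpha is the Laplace smoothing term."""
    class_counts = Counter(y for _, y in rows)
    feat_counts = defaultdict(Counter)  # (feature index, class) -> value counts
    values = defaultdict(set)           # feature index -> values seen in training
    for x, y in rows:
        for j, v in enumerate(x):
            feat_counts[(j, y)][v] += 1
            values[j].add(v)
    return class_counts, feat_counts, values, alpha, len(rows)

def predict_nb(model, x):
    """Return the class with the highest log prior plus summed log likelihoods."""
    class_counts, feat_counts, values, alpha, n = model
    best, best_lp = None, -math.inf
    for y, cy in class_counts.items():
        lp = math.log(cy / n)
        for j, v in enumerate(x):
            num = feat_counts[(j, y)][v] + alpha
            den = cy + alpha * len(values[j])
            lp += math.log(num / den)
        if lp > best_lp:
            best, best_lp = y, lp
    return best

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def multiclass_mcc(y_true, y_pred):
    """Multiclass Matthews Correlation Coefficient from label counts."""
    s = len(y_true)
    c = sum(t == p for t, p in zip(y_true, y_pred))
    t_k, p_k = Counter(y_true), Counter(y_pred)
    cov = c * s - sum(t_k[k] * p_k[k] for k in t_k)
    denom = math.sqrt((s * s - sum(v * v for v in p_k.values()))
                      * (s * s - sum(v * v for v in t_k.values())))
    return cov / denom if denom else 0.0

def evaluate_splits(rows, train_fractions=(0.75, 0.80, 0.85), seed=42):
    """Train and test once per partition ratio, using the same shuffle each time."""
    results = {}
    for frac in train_fractions:
        shuffled = rows[:]
        random.Random(seed).shuffle(shuffled)
        cut = int(round(frac * len(shuffled)))
        train, test = shuffled[:cut], shuffled[cut:]
        model = train_nb(train)
        y_true = [y for _, y in test]
        y_pred = [predict_nb(model, x) for x, _ in test]
        results[frac] = {
            "n_test": len(test),
            "accuracy": accuracy(y_true, y_pred),
            "mcc": multiclass_mcc(y_true, y_pred),
        }
    return results

if __name__ == "__main__":
    for frac, scores in evaluate_splits(make_synthetic()).items():
        print(f"train fraction {frac:.2f}: {scores}")
```

Because the sketch uses one random shuffle per ratio, differences between partitions on such small test sets can reflect sampling noise as much as the ratio itself; repeating the split over several seeds would give a steadier comparison.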