Health screening in pesantren (Islamic boarding schools) is challenging because of communal living conditions, limited health facilities, and the need to identify vulnerable student groups early. This study compares K-Prototypes and K-Medoids clustering for grouping student health profiles and evaluates cluster labels as additional features in a CatBoost classification model. The dataset covers 1,464 new students at Queen Al Falah Islamic Boarding School in the 2025/2026 academic year, collected through the admission system and preprocessed before analysis. Both algorithms are run with three clusters to support interpretation of nutritional and health profiles: although two clusters yield higher silhouette values, three clusters provide more meaningful distinctions for practical screening. Classification experiments use CatBoost with an 80:20 stratified train-test split and compare baseline models against hybrid models that add the other algorithm's cluster labels as features. The results are asymmetric. When the K-Medoids cluster label is the target, adding K-Prototypes features raises accuracy from 99.66 percent to 100 percent; when the K-Prototypes cluster label is the target, adding K-Medoids features lowers accuracy slightly, from 98.98 percent to 98.63 percent. McNemar's test indicates that neither difference is statistically significant. Overall, the proposed framework supports reliable and interpretable health profile clustering for monitoring pesantren students.
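As a rough check on how small the reported accuracy differences are, the sketch below converts the percentages into sample counts and applies an exact (binomial) McNemar test. It assumes the 20 percent test split of 1,464 students contains 293 samples, which the abstract does not state; under that assumption the four reported accuracies correspond to 292, 293, 290, and 289 correct predictions. The function name `mcnemar_exact` is hypothetical, and the discordant-pair counts for the K-Prototypes target are an illustrative minimal case, not figures from the study.

```python
from math import ceil, comb

# Assumption: test size is 20% of 1,464 rounded up (not stated in the abstract).
n_test = ceil(1464 * 0.2)

# Correct-prediction counts that reproduce the reported accuracies under that assumption.
correct = {
    "kmedoids_baseline": 292,     # reported 99.66 percent
    "kmedoids_hybrid": 293,       # reported 100 percent
    "kprototypes_baseline": 290,  # reported 98.98 percent
    "kprototypes_hybrid": 289,    # reported 98.63 percent
}
accuracy = {name: round(100 * k / n_test, 2) for name, k in correct.items()}


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value.

    b: samples the baseline got wrong and the hybrid got right.
    c: samples the baseline got right and the hybrid got wrong.
    """
    n = b + c
    if n == 0:
        return 1.0  # the two models never disagree
    k = min(b, c)
    # Binomial(n, 0.5) lower tail up to the smaller discordant count, doubled.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# K-Medoids target: the hybrid makes no errors, so the single baseline error
# is the only discordant pair (b=1, c=0) given the assumed test size.
p_kmedoids = mcnemar_exact(1, 0)

# K-Prototypes target: the true discordant counts are unknown; b=0, c=1 is the
# smallest pattern consistent with a one-sample drop (illustrative assumption).
p_kprototypes = mcnemar_exact(0, 1)

print(n_test, accuracy)
print(p_kmedoids, p_kprototypes)
```

Under these assumptions each comparison differs by a single test sample, and one discordant pair cannot produce a significant McNemar result, which is consistent with the non-significance reported above.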
Copyrights © 2026