Diabetes drug discovery remains slow, costly, and insufficiently personalised, particularly in resource-constrained healthcare settings. This study proposes and empirically evaluates a personalised generative artificial intelligence framework that integrates molecular and clinical data to generate diabetes drug candidates. Guided by the CRISP-DM methodology, a hybrid Clinical–Molecular Variational Autoencoder (VAE) architecture was developed that combines molecular representations with anonymised patient metabolic profiles, including HbA1c, fasting glucose, BMI, cholesterol, and age. Molecular data were sourced from PubChem and ChEMBL, and generated compounds were evaluated using drug-likeness metrics, molecular validity checks, and downstream effectiveness classification. The model generated chemically valid, drug-like molecules with an average Quantitative Estimate of Drug-likeness (QED) score above 0.5. At a fixed decision threshold, effectiveness classification achieved an accuracy of 0.80; however, analysis of the predicted probabilities across thresholds showed near-chance discrimination (AUC = 0.49), highlighting the impact of class imbalance on the accuracy figure. In contrast to prior molecule-centric generative approaches to drug discovery, this work presents one of the first empirically evaluated Clinical–Molecular dual-VAE frameworks for personalised diabetes drug discovery, one that explicitly integrates patient metabolic profiles and exposes calibration limitations in generative pharmaceutical pipelines.
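The gap between an accuracy of 0.80 and an AUC near 0.5 is what class imbalance can produce on its own. The sketch below illustrates this with a small synthetic dataset; it is not the study's data or classifier. The 80/20 class ratio, the scores, the 0.5 threshold, and the helper names `accuracy` and `roc_auc` are all assumptions made for illustration.

```python
# Illustrative only: synthetic labels and scores, NOT the study's data.
# Assumption: 80% of candidates belong to the majority class (label 1).

def accuracy(labels, scores, threshold=0.5):
    """Fraction of correct predictions at a fixed decision threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random
    negative (Mann-Whitney formulation); ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Eight positives and two negatives. Every score is above the threshold,
# and the negatives sit in the middle of the positive scores, so the
# ranking carries no information about the label.
labels = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
scores = [0.60, 0.62, 0.64, 0.66, 0.70, 0.72, 0.74, 0.76, 0.68, 0.68]

print("accuracy:", accuracy(labels, scores))
print("AUC:", roc_auc(labels, scores))
```

Because every candidate is predicted as the majority class, accuracy equals the majority-class share, while the AUC shows that the scores do not separate the two classes. Reporting a threshold-independent metric alongside accuracy makes this kind of failure visible.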
Copyrights © 2026