Sepsis is one of the leading causes of death in intensive care units (ICUs). Many patients do not receive timely or effective treatment, which lowers their chances of survival. We develop a reinforcement learning–based framework that provides personalised treatment recommendations for sepsis patients. The model builds simple patient representations from treatment responses, groups patients with similar response patterns, and learns a treatment policy optimised for each group. To reduce the long training time, we use parallel and distributed computing. On the MIMIC-III database, under off-policy evaluation with weighted importance sampling, our method achieves a policy value of 79.933, compared with 47.654 for the clinician policy and 57.658 for a general AI policy. A higher policy value indicates a lower estimated mortality risk. These results suggest that our method can support faster, more accurate, and more effective treatment decisions in the ICU.
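
For readers unfamiliar with the evaluation metric, the following is a minimal sketch of a trajectory-level weighted importance sampling (WIS) estimator, not the implementation used in this work. The function name `wis_policy_value`, the per-step tuple layout, the discount factor, and the toy rewards are all assumptions made for illustration.

```python
def wis_policy_value(trajectories, gamma=0.99):
    """Estimate the value of an evaluation policy from logged trajectories.

    Each trajectory is a list of (p_eval, p_behav, reward) tuples, one per step:
      p_eval  - probability the evaluation policy assigns to the logged action
      p_behav - probability the behaviour (clinician) policy assigned to it
      reward  - reward observed at that step
    This data layout is hypothetical, chosen only for this sketch.
    """
    weights = []
    returns = []
    for traj in trajectories:
        rho = 1.0   # cumulative importance ratio for this trajectory
        ret = 0.0   # discounted return for this trajectory
        for t, (p_eval, p_behav, reward) in enumerate(traj):
            rho *= p_eval / p_behav
            ret += (gamma ** t) * reward
        weights.append(rho)
        returns.append(ret)

    total_weight = sum(weights)
    if total_weight == 0.0:
        raise ValueError("all importance weights are zero; estimate undefined")

    # WIS normalises by the sum of the weights rather than the trajectory count.
    return sum(w * g for w, g in zip(weights, returns)) / total_weight


# Toy data: two logged trajectories with made-up probabilities and rewards.
toy = [
    [(0.5, 0.5, 0.0), (1.0, 0.5, 100.0)],  # weight 2.0, return 100.0
    [(0.25, 0.5, -100.0)],                 # weight 0.5, return -100.0
]
print(wis_policy_value(toy, gamma=1.0))  # 60.0
```

Normalising by the summed weights keeps the estimate within the range of observed returns, at the cost of a small bias relative to ordinary importance sampling.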
Copyrights © 2026