Early and accurate prediction of HIV/AIDS infection is critical to improving clinical decision-making and patient management. This study presents a machine learning approach to predicting HIV/AIDS infection status and evaluating the effectiveness of antiretroviral treatments, using a well-documented clinical dataset from 1996 comprising 2,139 patient records and 34 features. Through preprocessing, exploratory data analysis, and feature engineering, several new clinically relevant attributes were constructed, such as CD4/CD8 ratios and immunological change metrics. Four machine learning models (Logistic Regression, Support Vector Machine, Random Forest, and Gradient Boosting) were trained and evaluated. Among these, the Gradient Boosting classifier achieved the highest ROC-AUC score of 0.9335, while Random Forest achieved a ROC-AUC of 0.9180 and was selected for further evaluation because of its model transparency. Key features influencing infection prediction included CD4+ and CD8+ dynamics, baseline immunological levels, and treatment history. The study also examined treatment effectiveness by analyzing CD4+ cell count responses across therapy types. The combination of ZDV and ddI emerged as the most effective regimen, improving immune outcomes and lowering infection rates, while ZDV monotherapy showed the least favorable results. This work underscores the potential of machine learning as a clinical decision support tool in HIV/AIDS care and provides data-driven insights into treatment optimization. Future studies should incorporate longitudinal patient data and evaluate the models in real-world clinical environments for broader applicability.
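As an illustration of the feature engineering and evaluation summarized above, the following minimal sketch derives CD4/CD8 ratio and change features from a single patient record and computes ROC-AUC from its pairwise-comparison (Mann-Whitney) definition. It is not the study's implementation: the field names (`cd4_baseline`, `cd4_followup`, `cd8_baseline`, `cd8_followup`) and both helper functions are hypothetical and chosen only for this example.

```python
from typing import Dict, Sequence


def engineer_features(record: Dict[str, float]) -> Dict[str, float]:
    """Derive ratio and change features from one patient record.

    The keys read from `record` are hypothetical column names,
    not taken from the study's dataset.
    """

    def safe_ratio(num: float, den: float) -> float:
        # Return NaN instead of raising when the CD8 count is zero.
        return num / den if den else float("nan")

    return {
        "cd4_cd8_ratio_baseline": safe_ratio(
            record["cd4_baseline"], record["cd8_baseline"]
        ),
        "cd4_cd8_ratio_followup": safe_ratio(
            record["cd4_followup"], record["cd8_followup"]
        ),
        # Immunological change metrics: follow-up minus baseline.
        "cd4_change": record["cd4_followup"] - record["cd4_baseline"],
        "cd8_change": record["cd8_followup"] - record["cd8_baseline"],
    }


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Made-up values for illustration only; not study data.
    patient = {
        "cd4_baseline": 400.0,
        "cd4_followup": 500.0,
        "cd8_baseline": 800.0,
        "cd8_followup": 1000.0,
    }
    print(engineer_features(patient))
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

The pairwise form of ROC-AUC is quadratic in the number of records, which is acceptable for a dataset of a few thousand patients; in practice a library routine would typically be used instead.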
Copyright © 2025