This study applies machine learning techniques to predict Non-Communicable Diseases (NCDs) in Uganda, supporting preventive action by analyzing locally obtained data on risk factors and symptoms. A locally created dataset of NCDs, risk factors, and symptoms reported by medical practitioners was used to frame NCD prediction as a classification problem. Three models were developed: the first used only risk factors, the second used only symptoms, and the third combined risk factors and symptoms. Several machine learning classifiers, including K-Nearest Neighbours (KNN), Random Forest, Support Vector Machine (SVM), Artificial Neural Network (ANN), Naïve Bayes, and XGBoost, were applied to each model to assess their predictive performance. The results indicated that KNN was the best at predicting NCDs based on risk factors only, while SVM was the least effective. When symptoms alone were used, ANN and Naïve Bayes performed best and KNN was the weakest. When risk factors and symptoms were combined, Random Forest was the best classifier, while KNN was again the least effective. In conclusion, this study provides insights into the comparative performance of machine learning classifiers for modelling and predicting NCDs using locally relevant data in Uganda. The findings underscore the importance of accurately predicting NCDs at an early stage, enabling medical personnel to intervene and offer preventive treatment to high-risk individuals. Identifying the most effective classifiers paves the way for future research and implementation initiatives in low- and middle-income countries.
Copyrights © 2024