Diabetes mellitus (DM) is a major global health concern, responsible for 6.7 million deaths in 2021, roughly one death every five seconds. In Indonesia, it was the third leading cause of death in 2019, with a mortality rate of approximately 57.42 per 100,000 people. This study develops a machine learning model for diabetes prediction, targeting an accuracy of at least 85%, and incorporates a chatbot-based system to identify potential diabetes in women. The research uses primary data (glucose level, blood pressure, body mass index, and age) together with secondary data, such as pregnancy-related metrics, from the UCI Pima Indians Diabetes Database, which contains 768 records with eight attributes. Three machine learning algorithms are evaluated, Decision Tree, Logistic Regression, and Random Forest, using accuracy, precision, recall, and F1-score. The Decision Tree performs very well on Class 0, with precision, recall, and F1-score all at 0.97; on Class 1 it is weaker, with a precision of 0.80 and a recall of 0.84, giving an F1-score of 0.82. Logistic Regression also performs well on Class 0, with a precision of 0.95 and a recall of 0.99 (F1-score 0.97), but on Class 1 its precision of 0.93 is offset by a recall of only 0.68, giving an F1-score of 0.79. Random Forest is the best-performing model overall, with an accuracy of 0.96, which exceeds the 85% target. On Class 0 it reaches a precision of 0.96 and a recall of 0.99 (F1-score 0.97); on Class 1 it keeps a high precision of 0.93 but only a moderate recall of 0.74, giving an F1-score of 0.82.
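The F1-scores quoted above follow from the reported precision and recall through the standard harmonic-mean definition, F1 = 2PR / (P + R). The short Python sketch below recomputes them as a consistency check. It is an illustration only and not the study's evaluation code: the helper `f1_score` and the dictionaries `reported` and `f1_by_model` are names introduced here, and the inputs are the two-decimal values from the abstract, so the results are approximate.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs as reported in the abstract, keyed by (model, class).
# These are rounded figures, so the recomputed F1 is approximate.
reported = {
    ("Decision Tree", 0): (0.97, 0.97),
    ("Decision Tree", 1): (0.80, 0.84),
    ("Logistic Regression", 0): (0.95, 0.99),
    ("Logistic Regression", 1): (0.93, 0.68),
    ("Random Forest", 0): (0.96, 0.99),
    ("Random Forest", 1): (0.93, 0.74),
}

# Recompute F1 for every model and class, rounded to two decimals.
f1_by_model = {
    key: round(f1_score(p, r), 2) for key, (p, r) in reported.items()
}

for (model, cls), f1 in f1_by_model.items():
    print(f"{model:<20} class {cls}: F1 = {f1:.2f}")
```

Each recomputed value matches the F1-score stated in the abstract to two decimals. The Class 1 scores are driven mainly by recall: Logistic Regression and Random Forest share the same precision (0.93), and the gap between their F1-scores comes entirely from the difference in recall (0.68 versus 0.74).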
Copyrights © 2024