Feature importance of using explainable artificial intelligence (XAI) and machine learning for diabetes disease classification
Ahmad, Muhammad Maulana; Sulistianingsih, Neny; Hidjah, Khasnur
Journal of Information System and Artificial Intelligence, Vol. 6 No. 1 (2025)
Publisher : Universitas Mercu Buana Yogyakarta

DOI: 10.26486/jisai.v6i1.252

Abstract

Diabetes is one of the most significant global health problems of the modern era. The disease not only seriously affects patients' quality of life but also imposes a heavy economic and social burden on individuals and on the health service system as a whole. Early detection and effective treatment are therefore essential to reducing its prevalence and negative impact. This study aims to design a machine learning classification model that identifies feature importance with the help of Explainable Artificial Intelligence (XAI) methods in the case of diabetes. The model is expected to give a clear interpretation of the most relevant features or symptoms, making it easier to detect whether a person has diabetes based on a more optimally selected set of symptoms. The results show that models built on LIME-selected features are more accurate than those built on SHAP-selected features. With LIME feature selection, XGBoost achieved the highest accuracy, 98.47%, while Decision Tree and Random Forest achieved 91.97% and 91.49%, respectively. With SHAP feature selection, XGBoost achieved 90.94%, Decision Tree 80.59%, and Random Forest 88.46%. The dataset comprised 70,000 records, split into 80% training data and 20% test data. The main finding is that LIME feature selection combined with XGBoost gives the best accuracy, 98.47%, compared with 90.94% for SHAP feature selection combined with the same classifier.
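The split sizes and accuracy gaps implied by these figures can be checked with a short calculation. The sketch below uses only numbers reported in the abstract. The variable and function names are illustrative, and the estimate of correctly classified records assumes that accuracy was measured on the full 20% test split, which the abstract does not state explicitly.

```python
# Figures reported in the abstract: 70,000 records, 80/20 split, and
# accuracy (in percent) per feature-selection method and classifier.
TOTAL_RECORDS = 70_000
TEST_PERCENT = 20

accuracy_pct = {
    ("LIME", "XGBoost"): 98.47,
    ("LIME", "Decision Tree"): 91.97,
    ("LIME", "Random Forest"): 91.49,
    ("SHAP", "XGBoost"): 90.94,
    ("SHAP", "Decision Tree"): 80.59,
    ("SHAP", "Random Forest"): 88.46,
}

# Integer arithmetic avoids floating-point rounding in the split sizes.
test_size = TOTAL_RECORDS * TEST_PERCENT // 100
train_size = TOTAL_RECORDS - test_size


def approx_correct(acc_pct, n_test):
    """Approximate number of correct predictions.

    Assumption: accuracy was computed on all n_test test records.
    """
    return round(acc_pct / 100 * n_test)


def lime_minus_shap(classifier):
    """Accuracy gap in percentage points between LIME and SHAP selection."""
    gap = accuracy_pct[("LIME", classifier)] - accuracy_pct[("SHAP", classifier)]
    return round(gap, 2)


if __name__ == "__main__":
    print(f"train={train_size}, test={test_size}")
    for clf in ("XGBoost", "Decision Tree", "Random Forest"):
        print(f"{clf}: LIME leads SHAP by {lime_minus_shap(clf)} points")
    best = approx_correct(accuracy_pct[("LIME", "XGBoost")], test_size)
    print(f"LIME + XGBoost: about {best} of {test_size} test records correct")
```

Under these assumptions the split is 56,000 training and 14,000 test records, and the LIME advantage with XGBoost is 7.53 percentage points.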
These findings also show that LIME feature selection combined with the XGBoost algorithm can improve the interpretability of the model while maintaining or even improving prediction accuracy. This approach allows the most relevant features to be identified more efficiently, supporting better-informed decision-making in the data analysis process.
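One common way to turn per-instance explanations, such as those produced by LIME or SHAP, into a feature selection is to average the absolute attribution of each feature across the explained instances and keep the top-ranked features. The sketch below illustrates that idea only. The abstract does not say how the authors aggregated explanations, so the aggregation rule, the feature names, and the attribution values are all hypothetical.

```python
from statistics import mean


def global_importance(local_attributions):
    """Average absolute attribution per feature.

    local_attributions: list of dicts mapping feature name to a signed
    weight, one dict per explained instance. A feature missing from an
    explanation is treated as having zero weight for that instance.
    """
    features = sorted({name for row in local_attributions for name in row})
    return {
        name: mean(abs(row.get(name, 0.0)) for row in local_attributions)
        for name in features
    }


def top_k_features(importance, k):
    """Return the k features with the largest importance (ties broken by name)."""
    ranked = sorted(importance.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:k]]


# Hypothetical local explanations for three instances (not from the paper).
local_attributions = [
    {"glucose": 0.40, "bmi": -0.10, "age": 0.05},
    {"glucose": 0.30, "bmi": 0.20, "age": -0.02},
    {"glucose": -0.35, "bmi": 0.15},
]

importance = global_importance(local_attributions)
selected = top_k_features(importance, k=2)

if __name__ == "__main__":
    print(importance)
    print(selected)
```

A classifier such as XGBoost would then be retrained on the selected columns only. Using absolute values keeps features that push predictions strongly in either direction from cancelling out in the average.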