Akter, Alifa
Unknown Affiliation

An interpretable machine learning-based breast cancer classification using XGBoost, SHAP, and LIME
Dutta, Monoronjon; Mehedi Hasan, Khondokar Md.; Akter, Alifa; Rahman, Md. Hasibur; Assaduzzaman, Md.
Bulletin of Electrical Engineering and Informatics Vol 13, No 6: December 2024
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/eei.v13i6.7866

Abstract

Breast cancer is among the most prevalent and deadly cancers affecting women worldwide. Early and accurate identification is essential for effective treatment planning and better patient outcomes. This research aims to improve breast cancer classification accuracy with machine learning (ML) methods while emphasizing interpretability. The study used the chi-square method to identify the most significant features for further analysis, with the goal of improving test performance. It also improved data quality by identifying and removing outliers, reducing the influence of data irregularities on model performance. For classification, the study evaluated six ML algorithms: extreme gradient boosting (XGBoost), decision tree (DT), AdaBoost (AB), support vector machine (SVM), gradient boosting (GB), and K-nearest neighbors (KNN), each applied to distinguish between the two variants of breast cancer. Among these, the XGBoost classifier was the most accurate, achieving 99.30% accuracy. The research also applied Shapley additive explanations (SHAP) and local interpretable model-agnostic explanations (LIME) to make the proposed model more interpretable, offering insight into its decision-making process. These interpretability techniques provided insight into the predictive factors influencing healthcare outcomes, supporting the transparency and reliability of the classification approach.
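
To illustrate the idea behind the Shapley-value attributions mentioned in the abstract, here is a minimal sketch in plain Python that computes exact Shapley values for a tiny scoring function by enumerating every feature coalition. It is not the paper's model or pipeline: `toy_score`, the feature names, the input `x`, and the all-zero `baseline` are hypothetical and chosen only so the arithmetic is easy to follow. Exact enumeration is feasible only for a handful of features, which is why practical tools rely on faster algorithms or approximations.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x, relative to baseline.

    Features outside a coalition are replaced by their baseline value.
    """
    n = len(x)
    features = range(n)

    def value(coalition):
        # Coalition members take their value from x, the rest from baseline.
        mixed = [x[i] if i in coalition else baseline[i] for i in features]
        return predict(mixed)

    phi = [0.0] * n
    for i in features:
        others = [j for j in features if j != i]
        for size in range(n):
            # Shapley weight for a coalition of this size that excludes i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i when added to s.
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical toy scorer with one interaction term (assumption, not the paper's model).
def toy_score(f):
    radius, texture, concavity = f
    return 0.5 * radius + 0.2 * texture + 0.3 * radius * concavity


x = [2.0, 1.0, 3.0]         # hypothetical sample
baseline = [0.0, 0.0, 0.0]  # hypothetical reference point
phi = shapley_values(toy_score, x, baseline)

for name, contribution in zip(["radius", "texture", "concavity"], phi):
    print(f"{name}: {contribution:.2f}")
```

The attributions sum to the difference between the score at `x` and the score at the baseline, which is the additivity property that gives SHAP its name. The interaction term is split evenly between the two features that take part in it.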