Breast Cancer Classification Using z-score Thresholding and Machine Learning
Yildirim, Mustafa Eren; Salman, Yucel B.
Journal of Electronics, Electromedical Engineering, and Medical Informatics Vol 7 No 4 (2025): October
Publisher: Department of Electromedical Engineering, POLTEKKES KEMENKES SURABAYA

DOI: 10.35882/jeeemi.v7i4.1165

Abstract

Image processing and machine learning are used in biomedical applications as supporting tools for detecting and diagnosing certain diseases. Breast cancer is one such disease, and researchers have devoted great effort to it for decades. Both image-based and feature-based public datasets are available for this task. Images can become noisy because of factors such as hardware limitations or preprocessing. This noise can produce anomalies or outliers in the dataset, which may decrease detection accuracy and mislead medical staff during diagnosis. This study therefore examines how removing outliers from the dataset affects breast cancer detection accuracy. The proposed method removes outliers detected through z-score analysis. The remaining data are normalized, and the classification accuracies of ten methods are obtained through direct implementation: XGBoost, Neural Network, CNN, RNN, AdaBoost, LSTM, GRU, Random Forest, SVM, and Logistic Regression. The public Wisconsin Diagnostic Breast Cancer (WDBC) dataset was used in this study. An ablation study was conducted by varying the z-score threshold, and the best accuracy was obtained with a threshold of 3. The results obtained using the entire dataset were also compared with those obtained after outlier removal. The results showed that the average accuracy of all classifiers was 98.08%. In conclusion, the findings indicate that removing outliers from the dataset increases the overall accuracy of breast cancer detection.
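
The preprocessing steps described in the abstract (z-score outlier removal followed by normalization) can be illustrated with a minimal sketch. This is not the authors' code. The abstract does not give the details, so the sketch makes several assumptions: a sample is dropped if any single feature exceeds the threshold, the population standard deviation is used, and normalization is min-max scaling. The function names `zscore_filter` and `min_max_normalize` and the toy data are hypothetical and are not taken from WDBC.

```python
from statistics import mean, pstdev


def zscore_filter(rows, threshold=3.0):
    """Keep rows in which every feature satisfies |z| <= threshold.

    Assumption: one out-of-range feature is enough to drop the row.
    """
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    stds = [pstdev(col) for col in columns]  # population standard deviation
    kept = []
    for row in rows:
        is_outlier = False
        for value, mu, sigma in zip(row, means, stds):
            if sigma == 0:
                continue  # constant feature: z-score is undefined, skip it
            if abs((value - mu) / sigma) > threshold:
                is_outlier = True
                break
        if not is_outlier:
            kept.append(row)
    return kept


def min_max_normalize(rows):
    """Scale each feature to [0, 1] (assumed normalization scheme)."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


# Toy data: 19 ordinary samples plus one extreme sample in the first feature.
rows = [[float(i % 5), 10.0 + (i % 3)] for i in range(19)] + [[1000.0, 11.0]]

kept = zscore_filter(rows, threshold=3.0)  # threshold 3, as in the abstract
scaled = min_max_normalize(kept)

print(len(rows), len(kept))  # → 20 19
```

The normalized rows in `scaled` would then be passed to each classifier. The filter statistics are computed on the full dataset before removal, so a lower threshold discards more samples, which is the trade-off the ablation study explores.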