Journal of Applied Data Sciences
Vol 5, No 3: SEPTEMBER 2024

Breast Cancer Prediction Using Metrics-Based Classification

Armoogum, Sheeba
Dewi, Deshinta Arrova
Kezhilen, Motean
Trinawarman, Dedi



Article Info

Publish Date
23 Sep 2024

Abstract

Breast cancer remains the most prevalent form of cancer among women, with rising mortality rates worldwide. Early detection and accurate classification are crucial for improving patient outcomes, but manual detection methods are often time-consuming, complex, and prone to inaccuracies. This study aims to develop a machine learning (ML)-based desktop application to automate the detection and classification of breast cancer, thereby improving the efficiency and accuracy of diagnosis. Various ML algorithms, including Random Forest, Decision Tree, Support Vector Machine, Logistic Regression, Gaussian Naive Bayes, and K-Nearest Neighbors, were employed to build classification models. The Wisconsin Diagnostic Breast Cancer (WDBC) dataset was used, and pre-processing techniques such as data cleaning, over-sampling, and feature selection were applied to optimize model performance. Experimental results demonstrate that the Random Forest classifier outperformed the other models, achieving an accuracy of 95.54%, precision of 96.72%, recall (sensitivity) of 95.16%, specificity of 96%, and an F1-score of 95.93%. These results highlight the potential of ML techniques in enhancing breast cancer diagnosis by offering a more reliable and efficient classification process. Future work could focus on improving feature selection techniques and applying the model to more diverse datasets for broader applicability.
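The five metrics reported in the abstract all derive from the four cells of a binary confusion matrix. The following is a minimal sketch of those standard definitions, not the authors' implementation. The names `ConfusionCounts` and `classification_metrics` are hypothetical, and the counts in the example are an assumption: the abstract does not publish a confusion matrix, so the values below were chosen only because they are arithmetically consistent with the reported percentages.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a binary confusion matrix (hypothetical helper type)."""
    tp: int  # positive cases predicted positive
    fn: int  # positive cases predicted negative
    tn: int  # negative cases predicted negative
    fp: int  # negative cases predicted positive


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Compute the standard metrics named in the abstract as fractions in [0, 1]."""
    total = c.tp + c.tn + c.fp + c.fn
    precision = _ratio(c.tp, c.tp + c.fp)
    recall = _ratio(c.tp, c.tp + c.fn)  # also called sensitivity
    return {
        "accuracy": _ratio(c.tp + c.tn, total),
        "precision": precision,
        "recall": recall,
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "f1": _ratio(2 * precision * recall, precision + recall),
    }


# ASSUMPTION: illustrative counts, not taken from the paper.
example = ConfusionCounts(tp=59, fn=3, tn=48, fp=2)
metrics = classification_metrics(example)

for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

With these assumed counts, the computed values agree with the abstract's figures to two decimal places, which suggests the reported metrics are internally consistent. It does not establish the actual test-set size or class balance used in the study.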

Copyrights © 2024






Journal Info

Abbrev

JADS

Subject

Computer Science & IT; Control & Systems Engineering; Decision Sciences, Operations Research & Management

Description

One of the current hot topics in science is data: how can datasets be used in scientific and scholarly research in a more reliable, citable and accountable way? Data is of paramount importance to scientific progress, yet most research data remains private. Enhancing the transparency of the processes ...