Breast cancer is among the leading causes of cancer death worldwide and one of the most feared. It is classed as a highly dangerous cancer, ranking second after lung cancer, and the large number of cases across many regions of the world raises significant global concern. Beyond reducing patients' quality of life, breast cancer contributes substantially to global cancer mortality: it ranks as the fifth leading cause of cancer-related deaths, accounting for approximately 16.6% of total cancer deaths worldwide. This study classifies blood sample data from breast cancer patients. Several classification methods were applied, including K-Nearest Neighbor (KNN) and Naïve Bayes (NB). Performance was assessed with cross-validation and with a confusion matrix computed on the test data. Of the 569 data points collected, 70% (398 data points) were used as training data and the remaining 30% (171 data points) as test data. The Naïve Bayes method achieved an accuracy of 96%, with a precision of 94% and a recall of 91%. The K-Nearest Neighbor method, using K=7, yielded a lower accuracy of 73%, with a precision of 74% and a recall of 66%.
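The evaluation protocol described above (a 70/30 split, a K=7 nearest-neighbor vote, and accuracy, precision, and recall read off a confusion matrix) can be sketched as follows. This is a minimal illustration in plain Python, not the study's actual implementation: the function names, the Euclidean distance, the fixed shuffle seed, and the choice of label 1 as the positive class are all assumptions made for the example, and Naïve Bayes and cross-validation are omitted for brevity.

```python
import math
import random
from collections import Counter


def train_test_split(rows, train_fraction=0.7, seed=0):
    """Shuffle rows and split them into training and test subsets.

    Hypothetical helper: the seed and the truncating rounding are assumptions.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_fraction)  # 569 rows -> 398 for training
    return shuffled[:n_train], shuffled[n_train:]


def knn_predict(train, query, k=7):
    """Return the majority label among the k training rows nearest to query.

    Each training row is a (features, label) pair; distance is Euclidean
    (an assumption, the source does not name the metric).
    """
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def confusion_matrix(y_true, y_pred, positive=1):
    """Count (tp, fp, fn, tn) for a binary problem with the given positive label."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def metrics(tp, fp, fn, tn):
    """Derive accuracy, precision, and recall from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }
```

In use, one would call `train_test_split` on the labeled rows, apply `knn_predict` to each test row, and pass the true and predicted labels through `confusion_matrix` and `metrics`. Which class is treated as positive changes the precision and recall values, so that choice should be stated alongside any reported figures.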
Copyrights © 2024