Early diagnosis of breast cancer is critical to improving recovery rates and reducing cancer-related mortality. This study compares the performance of two machine learning algorithms widely used in medical data classification, Naive Bayes and Decision Tree, in detecting breast cancer on the Breast Cancer Wisconsin (Diagnostic) dataset. The dataset consists of 569 samples, each with 30 numerical features and one target label. The methodology comprises data preprocessing, model training, and performance evaluation using six metrics: accuracy, precision, recall, F1-score, area under the ROC curve (AUC), and Matthews correlation coefficient (MCC). Naive Bayes scored higher on every metric except recall, where the two models tied: it achieved 96.5% accuracy, 97.6% precision, 93.0% recall, 95.2% F1-score, 0.997 AUC, and 0.925 MCC, compared with 93.9% accuracy, 90.9% precision, 93.0% recall, 92.0% F1-score, 0.936 AUC, and 0.87 MCC for Decision Tree. Confusion matrix and ROC curve analyses support these results, particularly with respect to minimizing classification errors. While Decision Tree offers better interpretability, Naive Bayes may be more suitable for early breast cancer detection under similar dataset conditions. Future studies could explore ensemble approaches that combine the strengths of both methods.
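As an illustration of how five of the six metrics relate to a confusion matrix, the following minimal Python sketch computes accuracy, precision, recall, F1-score, and MCC from raw counts. The abstract does not report confusion-matrix counts or the train/test split, so the counts used here are an assumption: they are one set of values for a hypothetical 114-sample test set (about 20% of the 569 samples) that happens to reproduce the reported percentages after rounding. The function name `binary_metrics` is likewise illustrative. AUC is omitted because it depends on predicted scores rather than on the counts alone.

```python
from math import sqrt


def binary_metrics(tp, fp, fn, tn):
    """Compute five confusion-matrix metrics for a binary classifier."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    # Matthews correlation coefficient; defined as 0 when any margin is empty.
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
    }


# ASSUMPTION: hypothetical counts for a 114-sample test set, chosen only
# because they are consistent with the rounded figures in the abstract.
# They are not taken from the study itself.
naive_bayes = binary_metrics(tp=40, fp=1, fn=3, tn=70)
decision_tree = binary_metrics(tp=40, fp=4, fn=3, tn=67)

for name, metrics in [("Naive Bayes", naive_bayes), ("Decision Tree", decision_tree)]:
    print(name, {key: round(value, 3) for key, value in metrics.items()})
```

Under these assumed counts, the two models differ only in false positives (1 versus 4), which would explain why recall is identical while precision, F1-score, and MCC diverge.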
Copyrights © 2025