Breast cancer is among the diseases with the highest mortality rates worldwide. Breast tumors fall into two classes, malignant and benign, and identifying the class early allows for prevention and appropriate treatment before the disease spreads to other organs. Classification analysis of large volumes of breast cancer data is therefore needed. Data mining techniques such as Random Forest are suitable because they can provide accurate predictions with a low error rate. The results of this study indicate that Random Forest is an effective and accurate method for breast cancer classification, achieving an accuracy of 95%, an AUC-ROC of 0.99, and a recall of 97%. These values show that the model distinguishes the two classes very well, which can reduce the risk of misdiagnosis. The use of 5-fold cross-validation helps ensure that the results are stable and do not depend on a particular data split, giving a more reliable estimate of how well the model generalizes. Experiments with various parameters (n_estimators, max_depth, training data size) show that the best configuration is n_estimators = 100 and max_depth = 10, which provides the best balance between accuracy and model complexity. The model can be applied in a Medical Decision Support System to assist doctors in the early detection of breast cancer, thereby increasing the speed and accuracy of diagnosis.
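The evaluation setup described above can be sketched roughly as follows. This is a minimal illustration, not the study's actual pipeline. It assumes scikit-learn, and it uses the Wisconsin Diagnostic dataset bundled with that library as a stand-in, since the abstract does not name its data source. The stratified splitting, the shuffling, and the random seed are also assumptions. Only n_estimators = 100, max_depth = 10, and the use of 5 folds come from the text.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_validate

# Assumption: scikit-learn's bundled Wisconsin Diagnostic dataset stands in
# for the study's data, which the abstract does not identify.
X, y = load_breast_cancer(return_X_y=True)

# In this dataset 0 = malignant and 1 = benign. Flip the labels so that
# recall measures how many malignant cases are detected.
y = 1 - y

# Configuration reported as best in the abstract.
model = RandomForestClassifier(n_estimators=100, max_depth=10, random_state=42)

# 5-fold cross-validation. Stratification, shuffling, and the seed are
# assumptions made here for reproducibility.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

scores = cross_validate(
    model, X, y, cv=cv, scoring=["accuracy", "recall", "roc_auc"]
)

# Report the mean and spread across folds for each metric.
for metric in ("accuracy", "recall", "roc_auc"):
    fold_scores = scores[f"test_{metric}"]
    print(f"{metric}: mean={fold_scores.mean():.3f}, std={fold_scores.std():.3f}")
```

The per-fold standard deviation is what indicates stability: a small spread suggests the scores do not hinge on one particular split. The numbers this sketch prints will not necessarily match the figures reported above, because the data and the preprocessing may differ.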
Copyright © 2025