Yudha, M. Zahran
Unknown Affiliation

Published: 1 document
Journal : Malcom: Indonesian Journal of Machine Learning and Computer Science

Comparison and Data Visualization in Thyroid Cancer Disease Prediction Using Machine Learning Algorithms
Yudha, M. Zahran; Jasmir, Jasmir; Fachruddin, Fachruddin
MALCOM: Indonesian Journal of Machine Learning and Computer Science Vol. 6 No. 1 (2026): MALCOM January 2026
Publisher : Institut Riset dan Publikasi Indonesia

DOI: 10.57152/malcom.v6i1.2249

Abstract

Thyroid cancer is a common endocrine malignancy for which accurate early prediction can improve patient outcomes. Comprehensive comparative studies of machine learning algorithms for this task, accompanied by systematic visualization, remain limited. This study compares two tree-based algorithms (Decision Trees, Random Forest) and two boosting algorithms (Gradient Boosting, XGBoost) for thyroid cancer prediction and develops visualization strategies for clinical interpretation. The four algorithms were evaluated using accuracy (proportion of correct predictions), precision (positive predictive value), recall (true positive rate), F1-score (harmonic mean of precision and recall), and AUC-ROC (area under the ROC curve). Visualization techniques, including confusion matrices, ROC curves, and feature importance plots, supported interpretation of the models. XGBoost achieved the best performance, with accuracy 95.2%, precision 94.8%, recall 95.6%, F1-score 95.2%, and AUC-ROC 0.978, followed by Random Forest (93.5%, 92.7%, 94.1%, 93.4%, 0.965), Gradient Boosting (91.8%, 90.9%, 92.4%, 91.6%, 0.952), and Decision Trees (87.3%, 86.5%, 88.2%, 87.3%, 0.913), with values listed in the same metric order. Feature importance analysis identified key predictors. XGBoost outperformed the other models on all five metrics; Random Forest, however, ranked above Gradient Boosting, so the advantage is specific to XGBoost rather than to boosting algorithms as a class. Integrated visualization enhances clinical interpretability, and the results provide empirical guidance for implementing machine learning-based diagnostic support systems.
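
The metric definitions given in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The function names and the confusion-matrix counts are hypothetical and chosen only for illustration; the precision, recall, and F1 figures are the ones reported in the abstract. The second part recomputes each F1-score as the harmonic mean of the reported precision and recall, as a simple consistency check.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total                       # proportion of correct predictions
    precision = tp / (tp + fp) if (tp + fp) else 0.0   # positive predictive value
    recall = tp / (tp + fn) if (tp + fn) else 0.0      # true positive rate
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def f1_from(precision, recall):
    """Harmonic mean of precision and recall (inputs on the same scale)."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, NOT taken from the study.
example = classification_metrics(tp=90, fp=5, fn=4, tn=101)

# (precision %, recall %, reported F1 %) as stated in the abstract.
reported = {
    "XGBoost": (94.8, 95.6, 95.2),
    "Random Forest": (92.7, 94.1, 93.4),
    "Gradient Boosting": (90.9, 92.4, 91.6),
    "Decision Trees": (86.5, 88.2, 87.3),
}

# Recompute F1 to one decimal place and compare with the reported value.
recomputed = {name: round(f1_from(p, r), 1) for name, (p, r, _) in reported.items()}

for name, (_, _, f1_reported) in reported.items():
    print(f"{name}: recomputed F1 = {recomputed[name]}, reported F1 = {f1_reported}")
```

AUC-ROC is left out of the sketch because it depends on the ranking of predicted scores across all thresholds and cannot be derived from a single confusion matrix.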