Indonesian Journal of Data and Science
Vol. 4 No. 2 (2023): Indonesian Journal of Data and Science

Comparison Analysis of Classification Model Performance in Lung Cancer Prediction Using Decision Tree, Naive Bayes, and Support Vector Machine

Dewi Widyawati (Unknown)
Amaliah Faradibah (Unknown)
Lestari Lokapitasari Belluano, Poetri (Unknown)



Article Info

Publish Date
31 Jul 2023

Abstract

This research aims to analyze the performance of three classification models, namely Decision Tree Classifier, Support Vector Machine, and Naive Bayes Classifier, in predicting lung cancer using the "Lung Cancer Prediction" dataset. The performance evaluation metrics used include accuracy, precision weighted, recall weighted, and F1 weighted. As a preliminary step, exploratory data analysis (EDA) and dataset preprocessing, including feature selection, data cleaning, and data transformation, were conducted. The test data results showed that the Decision Tree Classifier and Naive Bayes Classifier had similar performances with high accuracy, precision, recall, and F1 values. Meanwhile, the Support Vector Machine also exhibited competitive performance, although its precision weighted value was slightly lower. Additionally, an outlier analysis was conducted using box plots, revealing that the Decision Tree Classifier had 2 outlier values, while the Support Vector Machine had 4 outlier values, and Naive Bayes had no outlier values. In conclusion, all three classification models demonstrated good potential in lung cancer prediction. However, selecting the best model requires consideration of relevant evaluation metrics for the application and accommodating the limitations of each model. Further evaluation and in-depth analysis are needed to ensure the reliability of the models in predicting lung cancer cases more accurately and consistently.

Copyrights © 2023






Journal Info

Abbrev

ijodas

Publisher

Subject

Computer Science & IT Decision Sciences, Operations Research & Management Mathematics

Description

IJODAS provides online media to publish scientific articles from research in the field of Data Science, Data Mining, Data Communication, Data Security and Data ...