Lung diseases, including lung cancer, are among the leading causes of death worldwide. Early detection is essential to improve patients' chances of recovery and to reduce healthcare costs, and machine learning algorithms can support it. This study evaluates five machine learning algorithms, namely K-Nearest Neighbors (K-NN), Naïve Bayes Classifier (NBC), Decision Tree (DT), Random Forest (RF), and Support Vector Machine (SVM), for lung disease prediction using a Kaggle dataset of 30,000 records with 11 attributes. The dataset was preprocessed and split into training and test sets at ratios of 70%:30% and 80%:20%. Algorithm performance was evaluated using precision, recall, F1-score, and accuracy. The results show that the RF, SVM, and DT algorithms achieved the highest performance, with accuracy reaching 94.72% at the 70%:30% ratio. The DT algorithm, which previously showed low performance in heart disease classification, delivered competitive results in lung disease prediction. This research highlights the importance of proper algorithm selection and data organization for improving the effectiveness of disease prediction. The findings contribute to the development of artificial intelligence technology for medical applications, particularly in supporting early diagnosis of lung diseases.
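The evaluation protocol described above (five classifiers, two train/test ratios, four metrics) can be sketched with scikit-learn. This is a minimal illustration, not the study's actual pipeline: the data is a synthetic stand-in generated with `make_classification`, all hyperparameters are library defaults, `GaussianNB` is an assumed choice of Naïve Bayes variant, the feature scaling for K-NN and SVM is an assumed preprocessing step, and the weighted averaging of precision, recall, and F1 is likewise an assumption. The helper names `make_models` and `evaluate` are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def make_models(seed=0):
    """Build the five classifiers with default settings (assumed, not the study's)."""
    return {
        # Distance- and margin-based models get feature scaling (assumed step).
        "K-NN": make_pipeline(StandardScaler(), KNeighborsClassifier()),
        "NBC": GaussianNB(),
        "DT": DecisionTreeClassifier(random_state=seed),
        "RF": RandomForestClassifier(random_state=seed),
        "SVM": make_pipeline(StandardScaler(), SVC()),
    }


def evaluate(X, y, test_sizes=(0.30, 0.20), seed=0):
    """Train every model on each split and collect the four metrics."""
    results = {}
    for test_size in test_sizes:
        X_train, X_test, y_train, y_test = train_test_split(
            X, y, test_size=test_size, stratify=y, random_state=seed
        )
        for name, model in make_models(seed).items():
            y_pred = model.fit(X_train, y_train).predict(X_test)
            precision, recall, f1, _ = precision_recall_fscore_support(
                y_test, y_pred, average="weighted", zero_division=0
            )
            results[(name, test_size)] = {
                "accuracy": accuracy_score(y_test, y_pred),
                "precision": precision,
                "recall": recall,
                "f1": f1,
            }
    return results


if __name__ == "__main__":
    # Synthetic placeholder data; replace with the preprocessed dataset.
    X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
    for (name, test_size), metrics in evaluate(X, y).items():
        split = f"{round((1 - test_size) * 100)}:{round(test_size * 100)}"
        print(f"{name:5s} {split}  " + "  ".join(f"{k}={v:.4f}" for k, v in metrics.items()))
```

Scores printed by this sketch come from synthetic data and are not comparable to the figures reported in the study.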
Copyright © 2025