Colon and lung cancers are two highly lethal cancers that can often coexist, which poses a challenge for accurate diagnosis. While research often concentrates on detecting a single cancer in a specific organ, this study proposes a machine learning approach that identifies both colon and lung cancers. The objective is to build a hybrid machine learning classification model that enhances diagnostic precision. The study uses the LC25000 dataset, which comprises 25,000 color histopathological images of lung and colon tissue, labeled for the presence or absence of cancer (adenocarcinoma). Image features are extracted with the pre-trained VGG-16 convolutional neural network (CNN), and the cancer type is then identified by one of three machine learning classification algorithms: Stochastic Gradient Descent (SGD), Random Forest (RF), or K-Nearest Neighbor (KNN). The models were evaluated using 10-fold cross-validation. CNN-SGD performed best across the evaluation metrics: on a scale of 0 to 100, it scored 99.8 for Area Under the Curve (AUC) and 98.88 for Classification Accuracy (CA). CNN-RF followed CNN-SGD closely in performance and trained 58.3 seconds faster. CNN-KNN ranked last among the evaluated models on F1, recall, AUC, and CA.
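The evaluation protocol described above can be illustrated with a minimal sketch. It is not the study's implementation: it assumes scikit-learn is available, replaces the VGG-16 image features with synthetic feature vectors from `make_classification`, and uses placeholder values for the sample count, feature dimension, number of classes, and all hyperparameters. The helper name `evaluate` is hypothetical. The sketch only shows how SGD, RF, and KNN classifiers might be compared under stratified 10-fold cross-validation on accuracy, F1, recall, AUC, and fit time.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# ASSUMPTION: synthetic vectors stand in for VGG-16 image features.
# Sample count, feature size, and class count are placeholders.
X, y = make_classification(
    n_samples=500, n_features=64, n_informative=20,
    n_classes=5, n_clusters_per_class=1, random_state=0,
)

# ASSUMPTION: hyperparameters are illustrative, not those of the study.
# "modified_huber" is chosen so the SGD model exposes predict_proba for AUC.
models = {
    "SGD": make_pipeline(
        StandardScaler(),
        SGDClassifier(loss="modified_huber", random_state=0),
    ),
    "RF": RandomForestClassifier(n_estimators=50, random_state=0),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
}

SCORING = ["accuracy", "f1_macro", "recall_macro", "roc_auc_ovr"]


def evaluate(models, X, y, n_splits=10):
    """Return the mean cross-validated metrics and timings for each model."""
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=0)
    summary = {}
    for name, model in models.items():
        scores = cross_validate(model, X, y, cv=cv, scoring=SCORING)
        # cross_validate also reports fit_time and score_time per fold.
        summary[name] = {key: float(np.mean(val)) for key, val in scores.items()}
    return summary


summary = evaluate(models, X, y)
for name, metrics in summary.items():
    print(
        f"{name}: CA={metrics['test_accuracy']:.3f} "
        f"F1={metrics['test_f1_macro']:.3f} "
        f"recall={metrics['test_recall_macro']:.3f} "
        f"AUC={metrics['test_roc_auc_ovr']:.3f} "
        f"fit_time={metrics['fit_time']:.3f}s"
    )
```

With real data, `X` would instead hold one feature vector per image, produced by a pre-trained VGG-16 network with its classification head removed, and `y` would hold the tissue labels. The scores printed for the synthetic data say nothing about the results reported in the study.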
Copyrights © 2024