Lung cancer is a major cause of mortality worldwide. Early clinical detection is hindered by non-specific symptoms, so accurate diagnosis depends on extracting subtle patterns from complex tabular medical data. Traditional machine learning approaches often fail to capture such patterns in heterogeneous datasets, which limits their value for clinical decision support. This study applies TabNet, an interpretable deep learning architecture, to multiclass lung cancer severity prediction (low, medium, high). Using the Kaggle Lung Cancer dataset, the methodology relies on TabNet's attention-based feature selection to process tabular data end to end, adaptively identifying key predictors while keeping the model interpretable. The model was trained with its default configuration and validated with stratified 5-fold cross-validation, achieving 98.50% accuracy, an F1-score of 0.98, and a macro-AUC-ROC of 0.9996 on the test set. Learning curves were stable, and interpretability analysis highlighted 'Genetic Risk' and 'Shortness of Breath' as the dominant factors. These results indicate that TabNet is a robust and inherently interpretable approach with the potential to improve the precision and transparency of lung cancer severity assessment in clinical practice.
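As an illustration of the evaluation protocol summarized above, the following is a minimal pure-Python sketch of stratified fold assignment and a macro-averaged F1-score. It is not the study's code: the function names `stratified_kfold` and `macro_f1`, the round-robin assignment scheme, the seed, and the toy labels are all assumptions made for illustration, and a practical pipeline would more likely rely on an established library.

```python
import random
from collections import defaultdict


def stratified_kfold(labels, k=5, seed=0):
    """Split sample indices into k folds with similar class proportions.

    Hypothetical helper: indices of each class are shuffled and then
    dealt round-robin across the folds.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1-scores."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy severity labels (invented for illustration, not the Kaggle data).
toy_labels = ["low"] * 10 + ["medium"] * 10 + ["high"] * 10
toy_folds = stratified_kfold(toy_labels, k=5)
```

Each fold in `toy_folds` can serve once as the held-out set while the remaining four are used for training; averaging a metric such as `macro_f1` over the five held-out folds gives the cross-validated estimate.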
Copyright © 2025