Lung diseases are among the leading causes of morbidity and mortality worldwide and require prompt, accurate diagnostic methods. We present a novel hybrid deep learning framework that integrates You Only Look Once version 8 (YOLOv8) for real-time detection with a Vision Transformer (ViT-B/16) for global-context classification of lung diseases in chest X-ray images. Built on transfer learning and a two-stage detection-classification pipeline, the proposed model addresses inter-image variability, overlapping disease features, and the scarcity of annotated medical examples. The hybrid model achieves a classification accuracy of 96.8% and an AUC-ROC of 0.98 on the National Institutes of Health (NIH) Chest X-ray dataset, which comprises over 112,000 images spanning 14 disease classes, outperforming several current state-of-the-art models. In addition, attention heatmaps and bounding-box visualizations correlate strongly with clinical variables and enhance interpretability. These results demonstrate the practicality of hybrid vision architectures for medical image analysis and support their integration into clinical decision-support systems.
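The two-stage detection-classification pipeline described above can be sketched as follows. This is a minimal illustration of the control flow only: the detector and classifier here are hypothetical stand-in stubs, not the actual YOLOv8 or ViT-B/16 models, and the label subset and function names are illustrative assumptions.

```python
# Illustrative sketch of a two-stage detect-then-classify pipeline.
# stub_detector and stub_classifier are placeholders standing in for
# YOLOv8 region proposals and ViT-B/16 classification, respectively.
import numpy as np

NIH_LABELS = ["Atelectasis", "Cardiomegaly", "Effusion"]  # subset for illustration

def stub_detector(image):
    """Stand-in for YOLOv8: returns candidate (x, y, w, h) boxes."""
    h, w = image.shape[:2]
    return [(0, 0, w // 2, h // 2), (w // 2, h // 2, w // 2, h // 2)]

def stub_classifier(crop):
    """Stand-in for ViT-B/16: returns per-class probabilities via softmax."""
    logits = np.array([crop.mean(), crop.std(), 1.0])
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

def two_stage_pipeline(image, detector, classifier):
    """Stage 1: localize candidate regions. Stage 2: classify each crop."""
    results = []
    for (x, y, w, h) in detector(image):
        crop = image[y:y + h, x:x + w]
        probs = classifier(crop)
        results.append({"box": (x, y, w, h),
                        "label": NIH_LABELS[int(np.argmax(probs))],
                        "confidence": float(probs.max())})
    return results

# Dummy grayscale "X-ray" for demonstration.
xray = np.random.default_rng(0).random((224, 224))
preds = two_stage_pipeline(xray, stub_detector, stub_classifier)
```

In a real deployment, `stub_detector` would be replaced by a trained YOLOv8 model emitting disease-region boxes, and `stub_classifier` by a fine-tuned ViT-B/16 operating on the cropped regions, with the per-region predictions aggregated into an image-level diagnosis.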
Copyright © 2025