This study evaluates two Transformer models, the Vision Transformer (ViT) and the Swin Transformer, for the analysis of thoracic X-ray images. The objective is to determine whether Transformer models can improve diagnostic accuracy for lung diseases, given challenges such as variable early symptoms and overlapping radiological signs. The dataset comprises 21,165 X-ray images: 3,616 COVID-19 cases, 10,192 normal images, 6,012 Lung Opacity images, and 1,345 pneumonia images. Model development involved tuning hyperparameters such as the number of epochs and the choice of optimizer. The results indicate that the AdamW and Adamax optimizers offer the best balance between computational efficiency and accuracy. The Swin Transformer with the Adamax optimizer reached the highest testing accuracy, 96.10%, in 33,802.70 seconds, while the Vision Transformer achieved a testing accuracy of 95.10% in 33,503.10 seconds.
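
The class counts reported above can be checked with a short calculation. The following is a minimal Python sketch, not code from the study: the variable names and the largest-to-smallest class ratio are illustrative assumptions, and only the four counts and the stated total come from the abstract.

```python
# Class counts as reported in the abstract.
class_counts = {
    "COVID-19": 3616,
    "Normal": 10192,
    "Lung Opacity": 6012,
    "Pneumonia": 1345,
}

# Sum of the four classes; this should match the stated total of 21,165 images.
total = sum(class_counts.values())

# Fraction of the dataset held by each class.
class_shares = {name: count / total for name, count in class_counts.items()}

# Ratio of the largest class to the smallest class.
# This is an illustrative imbalance measure, not a metric reported by the study.
imbalance_ratio = max(class_counts.values()) / min(class_counts.values())

for name, share in class_shares.items():
    print(f"{name:<12} {class_counts[name]:>6} images  {share:6.1%}")
print(f"Total: {total}, largest/smallest class ratio: {imbalance_ratio:.2f}")
```

The normal class is several times larger than the pneumonia class, so the reported testing accuracy is best read alongside this class distribution.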
Copyrights © 2024