Darmawan Bakti, Lalu
Unknown Affiliation

Published: 1 Document
Articles

Found 1 Document

EFFICIENT HYBRID CNN-VISION TRANSFORMER FOR MEDICAL IMAGE CLASSIFICATION WITH LIMITED ANNOTATIONS
Sudirman, San; Yani, Ahmad; Darmawan Bakti, Lalu
Jurnal Kecerdasan Buatan dan Teknologi Informasi Vol. 4 No. 3 (2025): September 2025
Publisher : Ninety Media Publisher

DOI: 10.69916/jkbti.v4i3.453

Abstract

Medical image classification is a critical component of computer-aided diagnosis systems, yet its performance is often hindered by the scarcity of annotated data, a common situation in the medical domain due to ethical, cost, and labeling constraints. Convolutional Neural Networks (CNNs) are effective at extracting local features but suboptimal at capturing global context; conversely, Vision Transformers (ViTs) excel at modeling long-range dependencies but require large amounts of training data. To address these limitations, this study proposes a hybrid CNN–Vision Transformer model that integrates the strengths of both to improve classification performance under limited annotation conditions. The model was tested on the OrganAMNIST dataset, consisting of 53,339 two-dimensional abdominal CT images spanning 11 organ classes. Experimental results show that the model achieves an accuracy of 92.3%, an F1-score of 91.8%, and an AUC of 99.5% with only 3.67 million parameters. Compared to ResNet50, the model reduces the parameter count by 84% and increases inference speed by up to 2.4 times, and it demonstrates better training stability than baseline models such as ResNet50 and ViT-Small. These results indicate that integrating local and global features in a hybrid architecture can improve accuracy and efficiency simultaneously, making the approach promising for medical diagnosis systems with limited data and computational resources.
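
The abstract does not spell out the architecture, but the general hybrid pattern it describes, a convolutional stem extracting local features whose spatial positions become tokens for a transformer encoder that models global context, can be sketched as below. This is a minimal illustrative sketch in PyTorch, assuming 28x28 grayscale OrganAMNIST inputs and the 11 classes stated above; the class name HybridCNNViT, the layer sizes, and the token scheme are assumptions for illustration, not the authors' implementation (reported at 3.67 million parameters).

import torch
import torch.nn as nn

class HybridCNNViT(nn.Module):
    """Illustrative hybrid CNN-ViT sketch (not the paper's exact model):
    a small convolutional stem extracts local features, and its spatial
    positions become tokens for a lightweight transformer encoder."""
    def __init__(self, in_channels=1, num_classes=11, embed_dim=192,
                 depth=4, num_heads=3):
        super().__init__()
        # CNN stem: downsamples 28x28 input to a 7x7 feature map (49 tokens)
        self.stem = nn.Sequential(
            nn.Conv2d(in_channels, 64, 3, stride=2, padding=1),  # 28 -> 14
            nn.BatchNorm2d(64), nn.ReLU(inplace=True),
            nn.Conv2d(64, embed_dim, 3, stride=2, padding=1),    # 14 -> 7
            nn.BatchNorm2d(embed_dim), nn.ReLU(inplace=True),
        )
        # Learnable [CLS] token and positional embeddings for 49 + 1 tokens
        self.cls_token = nn.Parameter(torch.zeros(1, 1, embed_dim))
        self.pos_embed = nn.Parameter(torch.zeros(1, 49 + 1, embed_dim))
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=num_heads,
            dim_feedforward=embed_dim * 4, batch_first=True, norm_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=depth)
        self.head = nn.Linear(embed_dim, num_classes)

    def forward(self, x):
        feats = self.stem(x)                       # (B, D, 7, 7)
        tokens = feats.flatten(2).transpose(1, 2)  # (B, 49, D)
        cls = self.cls_token.expand(x.size(0), -1, -1)
        tokens = torch.cat([cls, tokens], dim=1) + self.pos_embed
        encoded = self.encoder(tokens)
        return self.head(encoded[:, 0])            # classify from [CLS] token

model = HybridCNNViT()
logits = model(torch.randn(2, 1, 28, 28))  # OrganAMNIST-style grayscale batch
print(logits.shape)  # torch.Size([2, 11]), one logit per organ class

In designs of this kind, letting cheap convolutions do the spatial downsampling so the transformer attends over only a few dozen tokens is what keeps the parameter and compute budget small relative to a full ResNet50 or ViT.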