Understanding Transformers: A Comprehensive Review
Rahmadhani, Berlina; Purwono, Purwono; Safar Dwi Kurniawan
Journal of Advanced Health Informatics Research Vol. 2 No. 2 (2024)
Publisher : Peneliti Teknologi Teknik Indonesia

DOI: 10.59247/jahir.v2i2.292

Abstract

Transformers have been recognized as one of the most significant innovations in the development of deep learning technology, with widespread applications in Natural Language Processing (NLP), Computer Vision (CV), and multimodal data analysis. The self-attention mechanism at the core of this architecture captures global relationships in sequential and spatial data in parallel, enabling more efficient and accurate processing than Recurrent Neural Network (RNN) and Convolutional Neural Network (CNN)-based approaches. Models such as BERT, GPT, and the Vision Transformer (ViT) have been used for a variety of tasks, including text classification, translation, object detection, and image segmentation. Although the advantages of these models are significant, their high computing power requirements and reliance on large datasets remain major challenges. Efforts to overcome these limitations have produced lightweight variants, such as MobileViT and the Swin Transformer, designed to improve efficiency without sacrificing accuracy. Further research is also directed at applying transformers to multimodal data and specific domains, such as medical image analysis. With their high flexibility and adaptability, transformers continue to be regarded as a key component in the development of more advanced and far-reaching artificial intelligence.
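To illustrate the self-attention mechanism the abstract describes, the following is a minimal NumPy sketch of scaled dot-product self-attention. It is not drawn from the reviewed paper; the function name, dimensions, and random projection matrices are illustrative assumptions, and real transformer implementations add multiple heads, masking, and learned parameters.

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention over a sequence x of shape (seq_len, d_model)."""
    q = x @ w_q                                   # queries
    k = x @ w_k                                   # keys
    v = x @ w_v                                   # values
    d_k = k.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)               # pairwise token-token affinities
    # Softmax over each row: every token distributes attention over all tokens.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v                            # outputs mix information globally

# Toy example: 4 tokens, model width 8, random projections (illustrative only).
rng = np.random.default_rng(0)
seq_len, d_model = 4, 8
x = rng.normal(size=(seq_len, d_model))
w_q, w_k, w_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))
out = self_attention(x, w_q, w_k, w_v)
print(out.shape)  # (4, 8)
```

Because every token attends to every other token in a single matrix product, the computation is parallel across the sequence, which is the efficiency advantage over step-by-step RNN processing noted above.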