Image Caption Generator Using Bahdanau Attention Mechanism
Gowda, Nikhita B; Vaishnavi; Skanda B N, Avin; Rohan M; Raikar, Pratheek V
International Journal of Advanced Science Computing and Engineering Vol. 7 No. 3 (2025)
Publisher : SOTVI

DOI: 10.62527/ijasce.7.3.264

Abstract

This project presents an image captioning system built on an encoder-decoder framework with a Bahdanau (additive) attention mechanism. A CNN encoder extracts visual features, and an RNN decoder with an attention layer dynamically weights the most relevant image regions at each decoding step, producing contextually appropriate text descriptions. By aligning visual features with language, the attention mechanism improves contextual relevance and semantic accuracy, yielding captions closer to human descriptions. Teacher forcing is used during training to accelerate learning and improve fluency. The model achieves notable improvements on the Flickr8k dataset, as measured by standard metrics such as BLEU. Inspired by work such as "Show, Attend and Tell," the approach bridges image features and language generation; applications include assistive devices for visually impaired users, automated content indexing, and human-computer interaction. Future work will explore transformer-based attention, scaling to larger datasets, and improved generalization across diverse scenes.
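The additive attention step the abstract describes can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the weight matrices `W1`, `W2`, the vector `v`, and all shapes are illustrative assumptions. Each image-region feature vector is scored against the decoder's hidden state, the scores are softmax-normalized, and the resulting weights form the context vector.

```python
import numpy as np

def bahdanau_attention(decoder_state, encoder_features, W1, W2, v):
    """Additive (Bahdanau) attention over CNN image-region features.

    decoder_state:    (hidden,)        previous decoder hidden state
    encoder_features: (regions, feat)  one CNN feature vector per image region
    W1: (feat, units), W2: (hidden, units), v: (units,)  -- learned parameters
    Returns (context, weights): the attended feature vector and the
    per-region attention distribution.
    """
    # score_i = v . tanh(W1 f_i + W2 h) for each region feature f_i
    scores = np.tanh(encoder_features @ W1 + decoder_state @ W2) @ v  # (regions,)
    # Numerically stable softmax over regions
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # Context vector: attention-weighted sum of region features
    context = weights @ encoder_features  # (feat,)
    return context, weights
```

At each decoding step the decoder would consume `context` alongside the previous word embedding; the `weights` can also be visualized as a heatmap showing which image regions drove each generated word.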