Bulletin of Electrical Engineering and Informatics
Vol 14, No 4: August 2025

Exploring deep learning approaches for image captioning to mimic human understanding

Islam, Maheen
Hassan Ratul, Mahedi
Haque, Rezaul
Hossain Rony, Sazzad
Huq Asif, Azharul
Mittra, Tanni
Miskat Hossain, Md
Hasan, Mahamudul



Article Info

Publish Date
01 Aug 2025

Abstract

Image captioning has emerged as a vital research area in computer vision, aiming to enhance how humans interact with visual content. Despite recent progress, challenges such as improving caption diversity and accuracy remain. This study applies transfer learning models and recurrent neural network (RNN) algorithms, trained on the Microsoft Common Objects in Context (MS COCO) dataset, to improve image captioning quality. The models combine image and text features, pairing ResNet50, VGG16, and InceptionV3 with LSTM and BiLSTM. Performance is measured with BLEU, ROUGE, and METEOR under both greedy and beam search decoding. The InceptionV3+BiLSTM model outperformed the others, achieving a BLEU score of over 60%, a METEOR score of 28.6%, and a ROUGE score of 57.2%. This research contributes a simple yet effective image captioning model that provides accurate descriptions with human-like understanding. Errors are analyzed to improve results, and ongoing research aimed at enhancing the diversity, fluency, and accuracy of generated captions is discussed, with significant implications for improving the accessibility and searchability of visual media and for informing future research in this area.
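
The abstract reports results for both greedy and beam search decoding without giving the decoding details. As a minimal sketch of how the two strategies differ, the following Python example decodes from a hypothetical toy next-token table. The table, the function names, the beam width, and the `<start>`/`<end>` tokens are all assumptions made for illustration and are not taken from the paper. In an actual captioning model, the probabilities would come from the LSTM or BiLSTM decoder conditioned on the image features.

```python
import math

# Hypothetical toy "decoder": maps the previous token to a next-token
# distribution. These numbers are invented for illustration only.
TOY_MODEL = {
    "<start>": {"a": 0.6, "the": 0.4},
    "a": {"dog": 0.5, "cat": 0.5},
    "the": {"dog": 0.9, "cat": 0.1},
    "dog": {"<end>": 1.0},
    "cat": {"<end>": 1.0},
}


def next_token_probs(sequence):
    """Return the next-token distribution given the tokens so far."""
    return TOY_MODEL[sequence[-1]]


def greedy_decode(max_len=10):
    """Pick the single most probable token at every step."""
    seq = ["<start>"]
    log_prob = 0.0
    while seq[-1] != "<end>" and len(seq) < max_len:
        probs = next_token_probs(seq)
        token = max(probs, key=probs.get)
        log_prob += math.log(probs[token])
        seq.append(token)
    return seq, log_prob


def beam_search_decode(beam_width=2, max_len=10):
    """Keep the `beam_width` best partial captions at every step."""
    beams = [(["<start>"], 0.0)]
    for _ in range(max_len - 1):
        candidates = []
        for seq, score in beams:
            if seq[-1] == "<end>":
                # Finished captions are carried forward unchanged.
                candidates.append((seq, score))
                continue
            for token, p in next_token_probs(seq).items():
                candidates.append((seq + [token], score + math.log(p)))
        # Rank by cumulative log-probability and prune to the beam width.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
        if all(seq[-1] == "<end>" for seq, _ in beams):
            break
    return beams[0]


if __name__ == "__main__":
    print("greedy:", greedy_decode())
    print("beam:  ", beam_search_decode(beam_width=2))
```

In this toy table, greedy decoding commits to the locally best first word, while a beam of width 2 keeps the second-best first word alive and ends with a caption of higher overall probability. With `beam_width=1` the beam search reduces to greedy decoding.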

Copyrights © 2025






Journal Info

Abbrev

EEI

Subject

Electrical & Electronics Engineering

Description

Bulletin of Electrical Engineering and Informatics (Buletin Teknik Elektro dan Informatika) ISSN: 2089-3191, e-ISSN: 2302-9285 is open to submission from scholars and experts in the wide areas of electrical, electronics, instrumentation, control, telecommunication and computer engineering from the ...