Purbalingga is a regency in Central Java Province, Indonesia, that offers notable natural beauty and tourist destinations. Many tourists capture their visits in photos, which they then upload to social media. A single picture, however, can carry a great deal of information, and each viewer may interpret it differently; without captions, people may struggle to extract that information. Image captioning addresses this challenge by automatically generating textual descriptions for images. In addition, text-to-speech improves accessibility, helping visually impaired users understand image descriptions. This research develops an image captioning model for photographs of tourist attractions in Purbalingga using a Transformer architecture combined with ResNet50. The Transformer employs an attention mechanism to learn the context and relationships between inputs and outputs, while ResNet50 is a robust convolutional network used for image feature extraction. Model evaluation with BLEU metrics, which compare generated sentences against reference sentences, yields best scores of BLEU-{1, 2, 3, 4} = {0.672, 0.559, 0.489, 0.437}. Experiments show that increasing the embedding dimension and the number of layers lengthens training time and lowers BLEU scores, while varying the number of attention heads has minimal impact on the results. The best model is deployed in a web-based application built with the SDLC waterfall method, the Flask framework, and a MySQL database. The application lets users upload images of tourist attractions, receive automatically generated descriptions in Indonesian, and listen to the captions read aloud via a Web Speech API-based text-to-speech feature. Black-box testing produced valid outcomes for all test cases, indicating that the application operates as required and is ready for use.
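The BLEU-n scores cited above can be illustrated with a simplified, self-contained sketch: clipped n-gram precision against a single reference, a geometric mean over orders 1..n, and a brevity penalty. This is an assumption-laden illustration of the metric itself, not the paper's evaluation code, which may instead use a library implementation (e.g., NLTK).

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, reference, max_n=4):
    """Simplified sentence-level BLEU-max_n (uniform weights, one reference).

    Hypothetical sketch for illustration: real evaluations typically use
    multiple references and smoothing for short sentences.
    """
    cand, ref = candidate.split(), reference.split()
    log_prec = 0.0
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(cand, n))
        ref_counts = Counter(ngrams(ref, n))
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = max(sum(cand_counts.values()), 1)
        if clipped == 0:
            return 0.0  # no smoothing: any zero precision zeroes the score
        log_prec += math.log(clipped / total) / max_n
    # Brevity penalty discourages overly short candidates.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * math.exp(log_prec)
```

For example, an exact match scores 1.0, and a candidate sharing only some n-grams with the reference falls strictly between 0 and 1, mirroring how the BLEU-1 through BLEU-4 figures above decrease as longer n-grams become harder to match.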