Articles

Transforming Story Ideas from Images to Text Using Convolutional Neural Networks (CNN) and Generative Pre-trained Transformer (GPT-2)
Rizqullah, Moh Hasbi; Nurlatifah, Eva; Budiawan Zulfikar, Wildan
ISTEK Vol. 14 No. 2 (2025)
Publisher : Fakultas Sains dan Teknologi UIN Sunan Gunung Djati Bandung

DOI: 10.15575/istek.v14i2.2599

Abstract

The gap between rich visual inspiration and the ability to articulate it in writing, commonly experienced as writer’s block, remains a major obstacle in the writing process. This study aims to bridge that gap with a two-stage deep learning system that supplies automated narrative stimuli. The proposed method uses a custom Convolutional Neural Network (CNN) architecture to detect seven classes of natural objects from a dataset of 4,362 images. The detected objects are then used as prompts for a fine-tuned Generative Pre-trained Transformer (GPT-2) model, which generates poetic narratives. Experimental results show that the CNN module reached a peak classification accuracy of 61.96%. Confusion matrix analysis indicates that this limitation stems from high inter-class visual ambiguity rather than from overfitting. Although the GPT-2 module generates narratives with a BERTScore F1 of up to 0.6455, the primary finding is that overall narrative quality depends heavily on the accuracy of the CNN output, which acts as the critical bottleneck of the system.
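
The hand-off between the two stages, and the kind of confusion analysis the abstract refers to, can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation. The class names, the prompt wording, and the helper names (`predict_label`, `build_prompt`, `most_confused`) are all assumptions made for illustration, since the abstract does not list the seven classes or the prompt format. The CNN and GPT-2 models themselves are left out; only the glue logic around them is shown.

```python
from collections import Counter

# Hypothetical class names: the abstract does not name the seven classes.
CLASSES = ["mountain", "river", "forest", "beach", "sky", "flower", "lake"]


def predict_label(probs):
    """Stage 1 output: pick the class with the highest CNN probability."""
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASSES[best]


def build_prompt(label):
    """Stage 2 input: turn the label into a text prompt (wording is assumed)."""
    return f"Write a short poetic story about a {label}:"


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def most_confused(y_true, y_pred, top=3):
    """Return the most frequent (true, predicted) error pairs.

    Off-diagonal confusion-matrix cells with high counts point to classes
    that look alike, which is a different failure mode from overfitting.
    """
    counts = Counter(zip(y_true, y_pred))
    errors = {pair: n for pair, n in counts.items() if pair[0] != pair[1]}
    return sorted(errors.items(), key=lambda kv: (-kv[1], kv[0]))[:top]


if __name__ == "__main__":
    # Made-up probabilities and labels, used only to exercise the helpers.
    probs = [0.10, 0.60, 0.05, 0.05, 0.10, 0.05, 0.05]
    print(build_prompt(predict_label(probs)))

    y_true = ["lake", "lake", "river", "river", "forest"]
    y_pred = ["river", "lake", "river", "lake", "forest"]
    print(accuracy(y_true, y_pred))
    print(most_confused(y_true, y_pred))
```

Because the prompt is built directly from the predicted label, a misclassified image steers the language model toward the wrong subject no matter how well that model was fine-tuned, which is the bottleneck effect the abstract describes.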