Journal : Kinetik: Game Technology, Information System, Computer Network, Computing, Electronics, and Control

The Evolution of Image Captioning Models: Trends, Techniques, and Future Challenges
Bastian, Ade; Wahid, Abrar; Hafsari, Zacky; Mardiana, Ardi
Kinetik: Game Technology, Information System, Computer Network, Computing, Electronics, and Control Vol. 10, No. 4, November 2025
Publisher : Universitas Muhammadiyah Malang

DOI: 10.22219/kinetik.v10i4.2305

Abstract

This study presents a systematic literature review (SLR) of the evolution of image captioning models from 2017 to 2025, with emphasis on emerging challenges, methodological improvements, and major architectural developments. Motivated by the growing demand for accurate, context-aware image descriptions, the review follows the PRISMA methodology and selects 36 relevant papers from reputable scientific databases. The results indicate a clear shift from traditional CNN-RNN models to Transformer-based architectures, which has improved semantic coherence and contextual understanding. Recent techniques such as prompt engineering and GAN-based augmentation have improved generalization and caption diversity, while multimodal fusion approaches that incorporate attention mechanisms and knowledge integration have improved caption quality. Key areas of concern include data bias, fairness in model evaluation, and support for low-resource languages. The study highlights that modern vision-language models such as Flamingo, GIT, and LLaVA offer robust domain generalization through cross-modal learning and joint embedding. In addition, advances in pretraining procedures and lightweight models have improved computational efficiency in resource-constrained environments. The study contributes by identifying future directions, analyzing technical trade-offs, and outlining research trends, particularly in sectors such as healthcare, construction, and inclusive AI. The results suggest that, to be effective in real-world applications, future image captioning models must prioritize resource efficiency, fairness, and multilingual capability.

Maleo Emotion Audio Dataset Indonesia For Emotion Classification
Mardiana, Ardi; Permana, Sri Mentari Widya Ningrum; Sopiandi, Ii; Bastian, Ade; Irawan, Eka Tresna
Kinetik: Game Technology, Information System, Computer Network, Computing, Electronics, and Control Vol. 11, No. 2, May 2026 (Article in Progress)
Publisher : Universitas Muhammadiyah Malang

DOI: 10.22219/kinetik.v11i2.2474

Abstract

The limited availability of Indonesian speech emotion datasets hinders the development of Speech Emotion Recognition (SER) systems, even as demand for such systems grows in sectors such as customer service, education, and human-computer interaction. To address this gap, this study developed the Maleo Emotion Audio Dataset, a collection of three-second audio clips labeled with seven emotion categories: angry, neutral, disgusted, sad, happy, afraid, and surprised. The data was collected from YouTube, and the dataset is available at https://huggingface.co/datasets/maleo-ai/maleo-emotion. The clips were processed through preprocessing, feature extraction, and augmentation stages. Five main features were extracted: Zero Crossing Rate, energy, Mel-Frequency Cepstral Coefficients (MFCC), spectral roll-off, and spectral flux. To improve generalization, augmentation techniques such as pitch shifting, noise injection, and time stretching were applied. The classification model is a Convolutional Neural Network (CNN) implemented in TensorFlow. In evaluation, the model achieved 94.48% accuracy on the test data, with balanced performance across all emotion categories. These results indicate that the dataset and model architecture can effectively recognize emotions in Indonesian speech in a locally relevant context.
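
As an illustration of two of the features named in the abstract (Zero Crossing Rate and energy) and one of the augmentations (noise injection), the following is a minimal standard-library sketch. It is not the authors' pipeline: the frame length, hop length, sample rate, target SNR, the synthetic sine-wave clip, and all function names are assumptions chosen for the example, and the abstract does not say how these steps were actually implemented.

```python
import math
import random


def frame_signal(samples, frame_len, hop_len):
    """Split samples into overlapping frames; a trailing partial frame is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop_len)]


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def short_time_energy(frame):
    """Mean squared amplitude of the frame."""
    return sum(x * x for x in frame) / len(frame)


def inject_noise(samples, snr_db, rng):
    """Add white Gaussian noise scaled to a target signal-to-noise ratio (dB)."""
    signal_power = sum(x * x for x in samples) / len(samples)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in samples]


# Hypothetical stand-in for one three-second clip: a 440 Hz tone sampled at 8 kHz.
SAMPLE_RATE = 8000  # assumed value, not taken from the paper
clip = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE)
        for n in range(3 * SAMPLE_RATE)]

# Frame-level features (frame and hop sizes are illustrative assumptions).
frames = frame_signal(clip, frame_len=400, hop_len=200)
zcr = [zero_crossing_rate(f) for f in frames]
energy = [short_time_energy(f) for f in frames]

# Augmented copy of the clip at an assumed 20 dB SNR, seeded for repeatability.
noisy_clip = inject_noise(clip, snr_db=20, rng=random.Random(0))
```

In practice, MFCC, spectral roll-off, spectral flux, pitch shifting, and time stretching are usually computed with an audio library rather than by hand; they are omitted here to keep the sketch dependency-free.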