Journal : Jurnal Teknik Informatika (JUTIF)

Comparison of IndoNanoT5 and IndoGPT for Advancing Indonesian Text Formalization in Low-Resource Settings
Firdausillah, Fahri; Luthfiarta, Ardytha; Nugraha, Adhitya; Dewi, Ika Novita; Hafiizhudin, Lutfi Azis; Mumtaz, Najma Amira; Syarifah, Ulima Muna
Jurnal Teknik Informatika (Jutif) Vol. 6 No. 5 (2025): JUTIF Volume 6, Number 5, Oktober 2025
Publisher : Informatika, Universitas Jenderal Soedirman

DOI: 10.52436/1.jutif.2025.6.5.4935

Abstract

The rapid growth of digital communication in Indonesia has produced a distinct informal linguistic style that poses significant challenges for Natural Language Processing (NLP) systems trained on formal text. This mismatch often degrades the performance of downstream tasks such as machine translation and sentiment analysis. This study aims to provide the first systematic comparison of the IndoNanoT5 (encoder-decoder) and IndoGPT (decoder-only) architectures for Indonesian informal-to-formal text style transfer. We conduct experiments on the STIF-INDONESIA dataset with hyperparameter optimization, multiple evaluation metrics, and statistical significance testing. The results favor the encoder-decoder architecture: IndoNanoT5-base achieves a peak BLEU score of 55.99, outperforming IndoGPT's highest score of 51.13 by 4.86 points, a statistically significant improvement (p < 0.001) with a large effect size (Cohen's d = 0.847). This sets a new performance benchmark, 28.49 BLEU points above previous methods, which represents a 103.6% relative gain. Architectural analysis indicates that bidirectional context processing, explicit input-output separation, and cross-attention mechanisms provide critical advantages for handling Indonesian morphological complexity. Computational efficiency analysis shows important trade-offs between inference speed and output quality. This research advances Indonesian text normalization capabilities and provides empirical evidence for architectural selection in sequence-to-sequence tasks for morphologically rich, low-resource languages.
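
The headline figures in the abstract can be cross-checked with simple arithmetic. The sketch below is illustrative only and is not the authors' evaluation code. The variable names are hypothetical, and the prior-method BLEU score is not stated in the abstract; it is inferred here from the reported 28.49-point improvement.

```python
# Values quoted in the abstract.
t5_bleu = 55.99          # IndoNanoT5-base peak BLEU
gpt_bleu = 51.13         # IndoGPT highest BLEU
gain_over_prior = 28.49  # reported improvement over previous methods

# Gap between the two architectures.
gap = round(t5_bleu - gpt_bleu, 2)

# Assumption: the prior-method score is inferred, not reported in the abstract.
implied_prior_bleu = round(t5_bleu - gain_over_prior, 2)

# Relative gain = absolute gain / prior score, as a percentage.
relative_gain_pct = round(100 * gain_over_prior / implied_prior_bleu, 1)

print(gap, implied_prior_bleu, relative_gain_pct)
```

Under that assumption the computed gap is 4.86 points and the relative gain is 103.6%, both consistent with the figures in the abstract.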