Jurnal Teknik Informatika (JUTIF)
Vol. 6 No. 5 (2025): JUTIF Volume 6, Number 5, October 2025

Comparison of IndoNanoT5 and IndoGPT for Advancing Indonesian Text Formalization in Low-Resource Settings

Firdausillah, Fahri
Luthfiarta, Ardytha
Nugraha, Adhitya
Dewi, Ika Novita
Hafiizhudin, Lutfi Azis
Mumtaz, Najma Amira
Syarifah, Ulima Muna



Article Info

Publish Date
16 Oct 2025

Abstract

The rapid growth of digital communication in Indonesia has produced a distinct informal linguistic style that poses significant challenges for Natural Language Processing (NLP) systems trained on formal text. This mismatch often degrades performance on downstream tasks such as machine translation and sentiment analysis. This study aims to provide the first systematic comparison of the IndoNanoT5 (encoder-decoder) and IndoGPT (decoder-only) architectures for Indonesian informal-to-formal text style transfer. We conduct experiments on the STIF-INDONESIA dataset with hyperparameter optimization, multiple evaluation metrics, and statistical significance testing. The results favor the encoder-decoder architecture: IndoNanoT5-base reaches a peak BLEU score of 55.99, exceeding IndoGPT's highest score of 51.13 by 4.86 points, a statistically significant improvement (p < 0.001) with a large effect size (Cohen's d = 0.847). This sets a new performance benchmark, improving on previous methods by 28.49 BLEU points, a 103.6% relative gain. Architectural analysis indicates that bidirectional context processing, explicit input-output separation, and cross-attention mechanisms provide critical advantages for handling Indonesian morphological complexity. Computational efficiency analysis shows important trade-offs between inference speed and output quality. This research advances Indonesian text normalization capabilities and provides empirical evidence for architectural selection in sequence-to-sequence tasks for morphologically rich, low-resource languages.
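The figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not taken from the paper: the constant and function names are hypothetical, the baseline score is inferred from the reported 28.49-point gain rather than stated in the abstract, and the `cohens_d` helper assumes the common pooled-standard-deviation form for two independent samples, which may differ from the variant the authors used.

```python
import math
from statistics import mean, stdev

# Figures quoted in the abstract.
T5_BLEU = 55.99             # IndoNanoT5-base peak BLEU
GPT_BLEU = 51.13            # IndoGPT highest BLEU
GAIN_OVER_PREVIOUS = 28.49  # reported BLEU gain over previous methods


def implied_baseline(best: float, gain: float) -> float:
    """Score of the previous method implied by the reported absolute gain."""
    return best - gain


def relative_gain_percent(best: float, gain: float) -> float:
    """Absolute gain expressed as a percentage of the implied baseline."""
    return 100.0 * gain / implied_baseline(best, gain)


def cohens_d(a: list[float], b: list[float]) -> float:
    """Cohen's d with a pooled standard deviation (assumed variant)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var)


if __name__ == "__main__":
    print(f"BLEU difference:  {T5_BLEU - GPT_BLEU:.2f}")
    print(f"Implied baseline: {implied_baseline(T5_BLEU, GAIN_OVER_PREVIOUS):.2f}")
    print(f"Relative gain:    {relative_gain_percent(T5_BLEU, GAIN_OVER_PREVIOUS):.1f}%")
```

Under these assumptions the 4.86-point difference and the 103.6% relative gain are mutually consistent with an implied previous best of about 27.50 BLEU. Reproducing the reported effect size would require the per-sample scores, which the abstract does not give.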

Copyright © 2025






Journal Info

Abbrev

jurnal

Publisher

Subject

Computer Science & IT

Description

Jurnal Teknik Informatika (JUTIF) is an Indonesian national journal that publishes high-quality research papers in the broad field of Informatics, Information Systems, and Computer Science, which encompasses software engineering, information system development, computer systems, computer network, ...