GAN-Based End to End Text-to-Speech System for Indonesian Language
Dhiaulhaq, Moch Azhar; Ginanjar, Rizki Rivai; Lestari, Dessi Puji
Jurnal Linguistik Komputasional Vol 5 No 2 (2022)
Publisher : Indonesia Association of Computational Linguistics (INACL)

DOI: 10.26418/jlk.v5i2.115

Abstract

Modern text-to-speech (TTS) technology has matured to the point where recent work focuses on optimizing systems and on modeling TTS for resource-scarce languages, rather than on finding new model architectures. This paper proposes a novel approach to modeling a modern end-to-end (E2E) TTS system for the Indonesian language, integrating three different generative adversarial network (GAN)-based vocoders for comparison. In the evaluation, the proposed system shows promising results, with a mean opinion score (MOS) of 4.60 while maintaining fast inference, as shown by a real-time factor (RTF) below one.
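The two metrics named in this abstract can be sketched as follows. This is a minimal illustration of the commonly used definitions (RTF as synthesis time divided by the duration of the generated audio, MOS as the arithmetic mean of listener ratings), not the authors' evaluation code. The function names and the sample numbers are hypothetical and do not come from the paper.

```python
def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """Ratio of compute time to generated audio duration (assumed definition).

    A value below 1.0 means the system synthesizes speech faster than
    the speech takes to play back.
    """
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


def mean_opinion_score(ratings: list[float]) -> float:
    """Arithmetic mean of listener ratings, typically on a 1-5 scale."""
    if not ratings:
        raise ValueError("ratings must not be empty")
    return sum(ratings) / len(ratings)


# Illustrative values only (not taken from the paper).
rtf = real_time_factor(synthesis_seconds=0.5, audio_seconds=2.0)
mos = mean_opinion_score([5, 4, 5, 4])
print(f"RTF = {rtf:.2f}, MOS = {mos:.2f}")
```

With these made-up inputs, half a second of compute for two seconds of audio gives an RTF well under one, which is the sense in which the abstract describes inference as fast.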
XGBoost and Convolutional Neural Network Classification Models on Pronunciation of Hijaiyah Letters According to Sanad
Azis, Aaz Muhammad Hafidz; Lestari, Dessi Puji
JOIN (Jurnal Online Informatika) Vol 8 No 2 (2023)
Publisher : Department of Informatics, UIN Sunan Gunung Djati Bandung

DOI: 10.15575/join.v8i2.1081

Abstract

According to Sanad, the pronunciation of Hijaiyah letters can serve as a benchmark for correct or valid reading based on the makhraj and properties of the letters. However, the limited number of Qur'anic Sanad teachers remains one of the obstacles to learning the Qur'an. This study aims to identify the most practical combination of classification models for building a voice recognition system that supports learning without direct interaction with a teacher. The methods employed are the XGBoost algorithm and a CNN. Of the 12 letter trait labels, the CNN model was used for 10 (S1, S2, S4, S5, T1, T2, T3, T4, T5, and T6), while the XGBoost model was applied to trait labels S3 and T7. With additional data included, the average accuracy was 78.14% for property S (letters with opposing properties), 70.69% for property T (letters without opposing properties), and 73.79% overall per letter.
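The overall figure is numerically consistent with a mean weighted by the number of trait labels in each group. The sketch below assumes five S labels (S1 to S5) and seven T labels (T1 to T7), as implied by the labels listed in the abstract; the abstract does not state how the overall average was computed, so the weighting is an assumption.

```python
def weighted_mean(groups: list[tuple[float, int]]) -> float:
    """Mean of group averages, weighted by the number of items per group."""
    total_count = sum(count for _, count in groups)
    if total_count == 0:
        raise ValueError("groups must contain at least one item")
    return sum(value * count for value, count in groups) / total_count


# (average accuracy in %, number of trait labels); the counts are assumed
# from the labels S1-S5 and T1-T7 named in the abstract.
overall = weighted_mean([(78.14, 5), (70.69, 7)])
print(round(overall, 2))  # 73.79
```

Under that assumption, (5 × 78.14 + 7 × 70.69) / 12 reproduces the reported 73.79%.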
Comparing Pre-Norm and Post-Norm Transformers in Preserving Gender Information for Indonesian–English Translation through Attention-Based Signal Reinforcement
Wijanarko, Andik; Munir, Rinaldi; Khodra, Masayu Leylia; Lestari, Dessi Puji
Journal of Applied Data Sciences Vol 7, No 2: May 2026
Publisher : Bright Publisher

DOI: 10.47738/jads.v7i2.1257

Abstract

Gender realization in Indonesian–English machine translation remains challenging due to the absence of grammatical gender in Indonesian, which often leads to unstable or ambiguous gender representations in English outputs. While Transformer-based models have demonstrated strong general translation performance, their ability to preserve gender information across encoding layers remains inconsistent and poorly understood, particularly with respect to architectural normalization strategies. This study presents a comparative analysis of Pre-Norm and Post-Norm Transformer architectures in preserving gender information, and examines the role of attention-based signal reinforcement in mitigating representational degradation. The reinforcement mechanism is introduced prior to standard encoder processing to strengthen gender-relevant token interactions without modifying the overall model structure. Four controlled configurations (Post-Norm, Pre-Norm, Post-Norm with attention-based reinforcement, and Pre-Norm with attention-based reinforcement) are trained under identical random seeds on both unbalanced and balanced datasets. Evaluation is performed on gender-ambiguous test sentences without explicit gender annotations to assess generalization. Gender preservation is assessed at the output level using gender-specific accuracy and BLEU score, and at the representation level using cosine similarity between gender cue embeddings and English gendered pronouns. The results show that Post-Norm Transformers fail to maintain stable gender representations, yielding near-random gender accuracy (~50%) and negligible BLEU scores. Pre-Norm architectures improve training stability but achieve limited gender accuracy (around 30%). Incorporating attention-based signal reinforcement substantially enhances gender preservation, with accuracy rising to over 50% and reaching up to 56% under balanced training conditions, accompanied by a consistent increase in cosine similarity values (exceeding 0.35) between gender cues and corresponding pronouns. These findings indicate that normalization strategy and attention-based reinforcement jointly determine the stability of gender representations in Transformer-based machine translation.
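The representation-level measure mentioned in this abstract, cosine similarity between two embedding vectors, can be sketched as below. This is the standard definition only; how the authors extract gender cue and pronoun embeddings is not described here, so the toy vectors are hypothetical placeholders.

```python
import math


def cosine_similarity(u: list[float], v: list[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("vectors must be non-zero")
    return dot / (norm_u * norm_v)


# Toy 2-D vectors standing in for a gender cue embedding and a pronoun
# embedding (hypothetical; real embeddings have many more dimensions).
cue_embedding = [1.0, 1.0]
pronoun_embedding = [1.0, 0.0]
print(cosine_similarity(cue_embedding, pronoun_embedding))
```

Values range from -1 to 1, so a similarity above 0.35, as reported in the abstract, indicates a positive but moderate alignment between the two embeddings.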