Dessi Puji Lestari
Institut Teknologi Bandung

Published: 2 Documents
Articles

GAN-Based End to End Text-to-Speech System for Indonesian Language
Dhiaulhaq, Moch Azhar; Ginanjar, Rizki Rivai; Lestari, Dessi Puji
Jurnal Linguistik Komputasional Vol 5 No 2 (2022)
Publisher : Indonesia Association of Computational Linguistics (INACL)

DOI: 10.26418/jlk.v5i2.115

Abstract

Modern text-to-speech (TTS) technology has matured to the point where recent work focuses on optimizing systems and on modeling TTS for resource-scarce languages rather than on finding new model architectures. This paper proposes a novel approach to modeling modern end-to-end (E2E) TTS for the Indonesian language that integrates three different generative adversarial network (GAN)-based vocoders for comparison. In the evaluation, the proposed system shows promising results, with a mean opinion score (MOS) of 4.60, while maintaining fast inference, as indicated by a real-time factor (RTF) below one.

XGBoost and Convolutional Neural Network Classification Models on Pronunciation of Hijaiyah Letters According to Sanad
Azis, Aaz Muhammad Hafidz; Lestari, Dessi Puji
JOIN (Jurnal Online Informatika) Vol 8 No 2 (2023)
Publisher : Department of Informatics, UIN Sunan Gunung Djati Bandung

DOI: 10.15575/join.v8i2.1081

Abstract

According to Sanad, the pronunciation of Hijaiyah letters can serve as a benchmark for correct or valid reading based on the makhraj and properties of the letters. However, the limited number of Qur'anic Sanad teachers remains an obstacle to learning the Qur'an. This study aims to identify the most practical combination of classification models for building a voice recognition system that facilitates learning without requiring direct interaction with a teacher. The methods employed are the XGBoost algorithm and a convolutional neural network (CNN). Of the 12 letter trait labels, the CNN model was used for 10 (S1, S2, S4, S5, T1, T2, T3, T4, T5, and T6), while the XGBoost model was applied to trait labels S3 and T7. Furthermore, with additional data included, the average accuracy was 78.14% for property S (letters with opposing properties), 70.69% for property T (letters without opposing properties), and 73.79% overall per letter.