Pasha, Pancadrya Yashoda
Unknown Affiliation



Text-to-Speech Technology Development Using FastSpeech2 Algorithm for the Story of the Prophet
Firdaus, Muhammad Raihan; Firdaus, Muhammad Rihap; Pasha, Pancadrya Yashoda
Khazanah Journal of Religion and Technology Vol. 2 No. 2 (2024): December
Publisher : UIN Sunan Gunung Djati Bandung

DOI: 10.15575/kjrt.v2i2.1099

Abstract

Target 4.6 of the Sustainable Development Goals (SDGs) calls for improved literacy by 2030. As technology advances, making reading more accessible in the digital age is increasingly important, especially for individuals with limited time. Text-to-Speech (TTS) technology lets users consume text content, such as books or journals, in audio format while doing other activities. This research develops a TTS model based on FastSpeech2, a non-autoregressive deep learning architecture that uses Transformers to generate high-quality audio quickly and efficiently. The model is trained on the LJSpeech dataset, which consists of 13,100 audio clips with a total duration of approximately 24 hours. Preprocessing involves text normalization, audio feature extraction, and data synchronization; evaluation uses objective metrics such as Mel Cepstral Distortion (MCD) and Pitch Error. A key application of this TTS technology is narrating the stories of the Prophets, which are essential in Islamic teachings for imparting moral values, fostering spiritual connection, and offering timeless lessons. The results show that FastSpeech2 produces high-quality synthesized speech quickly and accurately, making it an effective option for audio literacy applications and a solution for individuals with limited reading time.
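
The abstract names Mel Cepstral Distortion (MCD) as an evaluation metric but does not say how it was computed. The sketch below shows one common formulation, and it is not necessarily the authors' procedure. It assumes the reference and synthesized utterances are already represented as mel-cepstral coefficient matrices of equal shape and are time-aligned frame by frame (for example with dynamic time warping, which is not shown). It also assumes the 0th coefficient (energy) is excluded. The function name `mel_cepstral_distortion` is hypothetical.

```python
import numpy as np


def mel_cepstral_distortion(ref_mcep, syn_mcep):
    """Mean MCD in dB between two aligned mel-cepstral sequences.

    Both inputs are arrays of shape (frames, coefficients).
    Assumption: the frames are already time-aligned and column 0
    holds the energy term, which is skipped here.
    """
    ref = np.asarray(ref_mcep, dtype=np.float64)
    syn = np.asarray(syn_mcep, dtype=np.float64)
    if ref.ndim != 2 or ref.shape != syn.shape:
        raise ValueError("inputs must be 2-D arrays of identical shape")

    # Drop the 0th coefficient and take per-frame squared differences.
    diff = ref[:, 1:] - syn[:, 1:]
    squared_sum = np.sum(diff ** 2, axis=1)

    # Per-frame distortion: (10 / ln 10) * sqrt(2 * sum_d (c_d - c'_d)^2)
    per_frame = (10.0 / np.log(10.0)) * np.sqrt(2.0 * squared_sum)

    # Average over all frames to get one score for the utterance.
    return float(np.mean(per_frame))
```

Lower values indicate that the synthesized cepstra are closer to the reference, and identical inputs give a score of 0 dB. Absolute values depend on the cepstral order and the alignment method, so scores are only comparable when those settings match.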