Garuda - Garba Rujukan Digital

Jurnal Pengembangan Teknologi Informasi dan Ilmu Komputer

Vol 5 No 12 (2021): December 2021

Bayu Aji Firmansyah (Fakultas Ilmu Komputer, Universitas Brawijaya)
Fitra Abdurrachman Bachtiar (Fakultas Ilmu Komputer, Universitas Brawijaya)

Publish Date
21 Oct 2021

Speech recognition research using end-to-end models based on bidirectional RNNs is still constrained by high latency. Speech recognition models built end to end also still produce spelling errors, and they are very sensitive to dialect and to the type of recording equipment used by speakers. This work examines how the network learns acoustic features, in terms of its gradient and loss, using a unidirectional GRU with CTC, which has a lower computational cost than a bidirectional RNN with CTC. The study does not use a language model to assist the acoustic model in mapping the acoustic signal. The data are audio recordings of the Quran translation in Indonesian dialect and language (Bahasa Indonesia), from which acoustic features are extracted using MFCC. Batch Normalization is also applied to the GRU network to avoid covariate shift between the layers of the network. The network consists of a three-layer multilayer perceptron (MLP) with ReLU activation, followed by one unidirectional GRU layer. After passing through the GRU, the data are processed by a single-layer perceptron (SLP) with softmax activation, whose output is the input to CTC. The network is optimized with the Adam optimizer, and the best model tested yields a WER of 90.611%. The network suffers from vanishing gradients, which slows its learning to recognize the acoustic signal. Using a unidirectional GRU also made no significant difference in the delay layer for exposing temporal information.
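To make the evaluation described in the abstract more concrete, the sketch below shows how per-frame softmax outputs might be turned into a label sequence and how WER is commonly computed. It is an illustration only, not the authors' code: the abstract does not state the decoding method, so greedy (best-path) CTC decoding is an assumption chosen because no language model is used, and the blank index, function names, and sample sentences are all hypothetical.

```python
from typing import List, Sequence

BLANK = 0  # assumed index of the CTC blank symbol (not stated in the abstract)


def ctc_greedy_decode(frame_labels: Sequence[int], blank: int = BLANK) -> List[int]:
    """Best-path CTC decoding: merge repeated labels, then drop blanks.

    `frame_labels` holds the arg-max class index of each softmax frame.
    """
    decoded: List[int] = []
    previous = None
    for label in frame_labels:
        # Keep a label only when it starts a new run and is not the blank.
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance divided by the reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the processed part of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical frame-level predictions; a blank separates the two 1s.
    print(ctc_greedy_decode([0, 1, 1, 0, 1, 2, 2, 0]))
    # Hypothetical transcripts: one reference word is missing.
    print(word_error_rate("saya pergi ke pasar", "saya pergi pasar"))
```

Because insertions count as errors, WER computed this way can exceed 100%, so a value such as the reported 90.611% indicates that most reference words were not recovered correctly.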

Journal Info

Jurnal Pengembangan Teknologi Informasi dan Ilmu Komputer

Abbrev

j-ptiik

Publisher

Universitas Brawijaya

Subject

Computer Science & IT; Control & Systems Engineering; Education; Electrical & Electronics Engineering; Engineering

Description

Jurnal Pengembangan Teknologi Informasi dan Ilmu Komputer (J-PTIIK) of Universitas Brawijaya is a scholarly journal in the field of computing that publishes scientific papers resulting from research by students of the Faculty of Computer Science (Fakultas Ilmu Komputer), Universitas Brawijaya. The journal is expected to advance research ...

Automatic Speech Recognition for Bahasa Indonesia Using a Unidirectional Gated Recurrent Unit