Comparison Of Feature Extraction Techniques For Long Short-Term Memory Models In Indonesian Automatic Speech Recognition
Armaisya, Dimas Dwi; Pamungkasari, Panca Dewi; Rifai, Achmad Pratama; Sholihati, Ira Diana; Gopal Sakarkar
Green Intelligent Systems and Applications Volume 5 - Issue 1 - 2025
Publisher : Tecno Scientifica Publishing

DOI: 10.53623/gisa.v5i1.605

Abstract

Automatic Speech Recognition (ASR) faced challenges in accuracy and noise robustness, particularly in Bahasa Indonesia. This research addressed the limitations of single feature extraction methods, such as Mel-Frequency Cepstral Coefficients (MFCC), which were sensitive to noise, and Relative Spectral Transform - Perceptual Linear Predictive (RASTA-PLP), which was less effective in frequency representation, by proposing a hybrid approach that combined both techniques using Long Short-Term Memory (LSTM) models. MFCC enhanced spectral accuracy, while RASTA-PLP improved noise robustness, resulting in a more adaptive and informative acoustic representation. The evaluation demonstrated that the hybrid method outperformed single and non-extraction approaches, achieving a Character Error Rate (CER) of 0.5245 on clean data and 0.8811 on noisy data, as well as a Word Error Rate (WER) of 0.9229 on clean data and 1.0015 on noisy data. Although the hybrid approach required longer training times and higher memory usage, it remained stable and effective in reducing transcription errors. These findings suggested that the hybrid method was an optimal solution for Indonesian speech recognition in various acoustic conditions.
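
The CER and WER figures above are easier to read with the usual definition in mind: the edit distance between the reference transcript and the model output, divided by the length of the reference. Because insertions count as errors, the ratio can exceed 1, which is how a WER such as 1.0015 can arise. The sketch below is a minimal illustration of that standard definition, not the authors' scoring script; the function names and the example sentences are assumptions made for illustration.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences
    (counts substitutions, insertions, and deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word Error Rate: word-level edit distance / reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character Error Rate: character-level edit distance / reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical example: one substitution plus two insertions against a
# two-word reference gives 3 errors / 2 words, so the WER is above 1.
print(wer("selamat pagi", "selamat siang semua orang"))
print(cer("pagi", "bagi"))
```

Under this definition, a WER near 1 means the number of word-level errors is roughly equal to the number of words in the reference, even when many characters are transcribed correctly, which is consistent with the CER being lower than the WER.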