Chang, Siu-Hong
Unknown Affiliation

Published: 1 document
Articles

Enhanced speech recognition in natural language processing
Chang, Siu-Hong; Ng, Kok-Why; Haw, Su-Cheng; Yoong, Yih-Jian
Bulletin of Electrical Engineering and Informatics Vol 14, No 6: December 2025
Publisher: Institute of Advanced Engineering and Science

DOI: 10.11591/eei.v14i6.9539

Abstract

Speech recognition is crucial for helping individuals with physical disabilities access digital content. However, current systems have significant flaws that hinder user experience and complicate daily tasks. Environmental disturbances can cause misinterpretation, and existing automatic speech recognition (ASR) systems struggle to capture acoustic and linguistic nuances and to handle diverse speaking styles and accents. To address these issues, the proposed model integrates bidirectional encoder representations from transformers (BERT) and transformer features with natural language processing (NLP) capabilities. The model consolidates semantic, linguistic, and acoustic information extracted with the Kaldi speech recognition toolkit and improves accuracy by rescoring the list of N-best hypotheses. The approach leverages advances in NLP to improve the accuracy and robustness of speech recognition across various scenarios. Evaluations on the LibriSpeech dataset show that integrating BERT, a transformer encoder, and generative pretrained transformer 2 (GPT-2) for rescoring N-best hypotheses significantly improves transcription accuracy. The proposed model achieves a word error rate (WER) of 17.98%, outperforming other models. This development paves the way for advancements in speech recognition technology, offering better user experiences in real-world applications.
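
The abstract names two mechanisms that a small example can make concrete: rescoring an N-best list and measuring word error rate. The Python sketch below is illustrative only and is not the authors' implementation. The linear score combination, the `lm_weight` parameter, and the toy scorer `toy_lm_score` (a stand-in for a neural language model such as BERT or GPT-2) are all assumptions introduced here. WER is computed in the standard way, as the word-level edit distance divided by the number of reference words.

```python
from typing import Callable, List, Tuple


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j]: edit distance between the reference words seen so far
    # (initially none) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


def rescore_nbest(
    nbest: List[Tuple[str, float]],
    lm_score: Callable[[str], float],
    lm_weight: float = 0.5,
) -> str:
    """Pick the hypothesis with the best combined score.

    nbest holds (text, first_pass_score) pairs, where higher scores are
    better (e.g. log-probabilities). The linear combination below is an
    assumed scheme, not one taken from the paper.
    """
    if not nbest:
        raise ValueError("nbest must not be empty")
    best_text, _ = max(
        nbest, key=lambda item: item[1] + lm_weight * lm_score(item[0])
    )
    return best_text


# Hypothetical stand-in for a neural language model: penalise every
# word that falls outside a tiny vocabulary.
TOY_VOCAB = {"the", "cat", "sat", "on", "mat"}


def toy_lm_score(text: str) -> float:
    return -2.0 * sum(word not in TOY_VOCAB for word in text.split())


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    nbest = [
        ("the cat sad on the mat", -10.0),  # first-pass top choice
        ("the cat sat on the mat", -10.5),
    ]
    first_pass = max(nbest, key=lambda item: item[1])[0]
    rescored = rescore_nbest(nbest, toy_lm_score, lm_weight=0.5)
    print("first pass WER:", word_error_rate(reference, first_pass))
    print("rescored WER:  ", word_error_rate(reference, rescored))
```

In this toy case the first-pass top hypothesis contains one substituted word, and the language-model penalty moves the correct hypothesis to the top. In a real system the first-pass scores would come from the recognizer's N-best output, and `lm_score` would be replaced by a sentence-level score from a pretrained model.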