Nguyen, Tuan-Linh
Unknown Affiliation

Articles


Multi-task deep learning for Vietnamese capitalization and punctuation recognition
Nguyen, Phuong-Nhung; Thu-Hien, Nguyen; Nguyen, Truong-Thang; Thu-Nga, Nguyen Thi; Anh-Phuong, Nguyen Thi; Nguyen, Tuan-Linh
IAES International Journal of Artificial Intelligence (IJ-AI) Vol 14, No 2: April 2025
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijai.v14.i2.pp1605-1615

Abstract

Speech recognition converts the speech signal of a particular language into the corresponding sequence of words in text form. The output of automatic speech recognition (ASR) systems often lacks structure such as punctuation and capitalization of sentence-initial letters, proper nouns, and place names. This absence of structure complicates comprehension and restricts the utility of ASR-generated text in applications such as creating movie subtitles, generating transcripts for online meetings, and extracting customer information. Standardizing ASR output text is therefore necessary to improve the overall quality of ASR systems. In this article, we apply multi-task deep learning to capitalization and punctuation recognition (CPR) for Vietnamese ASR output, using named entity recognition (NER) as an auxiliary task to help the CPR model perform better, and we propose using text-to-speech (TTS) to create a dataset for training the CPR-NER multi-task model. The experimental results show that the multi-task deep learning model improves the CPR F1 score by 6.2% on ASR output and 7.1% on raw text.
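
As an illustration of the task setup described in the abstract, the following minimal sketch shows one common way to derive CPR training labels from ordinary punctuated text: each word is lowercased and stripped of punctuation to mimic ASR output, and the removed information becomes per-token labels. The label names (`UPPER`, `LOWER`, `PERIOD`, `COMMA`, `QUESTION`, `O`), the function names `make_cpr_example` and `joint_loss`, and the weighted-sum form of the multi-task loss are all assumptions made for this example; the abstract does not specify the paper's tag scheme or loss.

```python
import string

# Hypothetical punctuation tag set; the paper's actual scheme is not given here.
PUNCT = {".": "PERIOD", ",": "COMMA", "?": "QUESTION"}


def make_cpr_example(sentence):
    """Convert punctuated text into ASR-like tokens plus CPR labels.

    Returns (tokens, cap_labels, punct_labels), one label per token.
    """
    tokens, caps, puncts = [], [], []
    for word in sentence.split():
        core = word.rstrip(string.punctuation)
        if not core:
            continue  # the token consisted of punctuation only
        # The first mark that followed the word becomes its punctuation label.
        trailing = word[len(core):]
        punct = PUNCT.get(trailing[:1], "O")
        core = core.lstrip(string.punctuation)
        tokens.append(core.lower())
        caps.append("UPPER" if core[0].isupper() else "LOWER")
        puncts.append(punct)
    return tokens, caps, puncts


def joint_loss(cpr_loss, ner_loss, ner_weight=0.5):
    """Assumed multi-task objective: CPR loss plus a weighted auxiliary NER loss."""
    return cpr_loss + ner_weight * ner_loss


if __name__ == "__main__":
    tokens, caps, puncts = make_cpr_example("Hà Nội là thủ đô của Việt Nam.")
    for row in zip(tokens, caps, puncts):
        print(row)
```

In a setup like this, a sequence-labeling model would read the lowercased tokens and predict the capitalization and punctuation labels, while a second output head trained on NER tags would share the same encoder; `ner_weight` is a hypothetical hyperparameter controlling how much the auxiliary task contributes.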