Garuda - Garba Rujukan Digital

Elkawnie

Vol 10, No 1 (2024)

Adel Sabour (Computer Science and Systems, University of Washington, Tacoma)
Abdeltawab Hendawi (Computer Science and Statistics, University of Rhode Island, Rhode Island)
Mohamed Ali (Computer Science and Systems, University of Washington, Tacoma)

Publish Date
21 Jun 2024

Abstract: This paper introduces the Diacritic-Aware Segmentation and Alignment Model for Arabic (DASAM). Diacritics are vital to pronunciation and meaning in Arabic, yet current speech recognition systems often ignore them. DASAM performs word-level segmentation and alignment of unseen audio and associates each segment with diacritic-marked Arabic text. The approach uses linguistic analysis based on intonation rules, then applies Dynamic Time Warping (DTW) to match the reference audio of each word to its position in the unseen sentence audio. The model outputs a list of words with their start and end times in the recording. Tested on the Qur’an dataset, DASAM outperforms Google Speech-to-Text (STT) in predicting word timings: it achieves text-audio alignment accuracy of 0.959 and 0.957 for word start and end times, respectively, compared with 0.870 and 0.849 for Google STT. DASAM also employs advanced signal processing techniques and is robust across a range of audio variations. These results establish DASAM as a fundamental building block for speech-to-text conversion and linguistic research in Arabic, particularly for applications involving diacritics.
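
The DTW matching step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each recording has already been converted to a sequence of per-frame feature values, and it uses a plain subsequence DTW with Euclidean frame distance. The function names `subsequence_dtw` and `frames_to_seconds` and the `hop_seconds` parameter are hypothetical, introduced only for this example.

```python
import numpy as np


def subsequence_dtw(query, reference):
    """Locate `query` (a word) inside `reference` (a sentence).

    Both inputs are sequences of per-frame features, 1-D or (frames, dims).
    Returns (start_frame, end_frame, cost), frame indices inclusive.
    Illustrative sketch only, not the DASAM implementation.
    """
    q = np.asarray(query, dtype=float)
    r = np.asarray(reference, dtype=float)
    if q.ndim == 1:
        q = q[:, None]
    if r.ndim == 1:
        r = r[:, None]
    n, m = len(q), len(r)

    # Pairwise Euclidean distance between every query and reference frame.
    dist = np.linalg.norm(q[:, None, :] - r[None, :, :], axis=2)

    # Accumulated cost; the first row is not accumulated, so the match
    # may start at any reference frame at no extra cost.
    acc = np.full((n, m), np.inf)
    acc[0, :] = dist[0, :]
    for i in range(1, n):
        acc[i, 0] = acc[i - 1, 0] + dist[i, 0]
        for j in range(1, m):
            acc[i, j] = dist[i, j] + min(
                acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1]
            )

    # The match may also end anywhere: take the cheapest cell in the last row.
    end = int(np.argmin(acc[-1]))

    # Backtrack to the first query frame to recover the start position.
    i, j = n - 1, end
    while i > 0:
        if j == 0:
            i -= 1
            continue
        step = int(np.argmin([acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1]]))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return j, end, float(acc[-1, end])


def frames_to_seconds(start_frame, end_frame, hop_seconds):
    """Convert inclusive frame indices to (start_time, end_time) in seconds."""
    return start_frame * hop_seconds, (end_frame + 1) * hop_seconds


if __name__ == "__main__":
    sentence = [0, 0, 0, 1, 2, 3, 0, 0]  # toy per-frame feature values
    word = [1, 2, 3]
    start, end, cost = subsequence_dtw(word, sentence)
    print(start, end, cost)
    print(frames_to_seconds(start, end, hop_seconds=0.01))
```

Leaving the first and last rows unconstrained is what makes this a subsequence match: the word may begin and end at any frame of the sentence, which is the situation the abstract describes. A real system would operate on spectral features rather than toy integers.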

Journal Info

Elkawnie

Abbrev

elkawnie

Publisher

Universitas Islam Negeri Ar-Raniry Banda Aceh

Subject

Biochemistry, Genetics & Molecular Biology; Engineering

Description

Elkawnie is a journal of the integration of science and technology with Islam. It covers research and technology in Architecture, Biology, Chemistry, Environmental Engineering, ICT, Physical Engineering, and other fields of science and technology. In particular, Elkawnie's journal ...

Arabic Diacritic-Aware Text-Audio Segmentation and Alignment Model (DASAM)
