Mohamed Ali
Computer Science and Systems, University of Washington, Tacoma


Arabic Diacritic-Aware Text-Audio Segmentation and Alignment Model (DASAM)
Adel Sabour; Abdeltawab Hendawi; Mohamed Ali
Elkawnie: Journal of Islamic Science and Technology Vol 10, No 1 (2024)
Publisher : Universitas Islam Negeri Ar-Raniry Banda Aceh

DOI: 10.22373/ekw.v10i1.23637

Abstract

This paper introduces the Diacritic-Aware Segmentation and Alignment Model for Arabic (DASAM). Diacritics are vital for pronunciation and meaning in Arabic, yet current speech recognition systems often ignore them. DASAM performs word-level segmentation and alignment on unseen audio and associates each segmented word with diacritic-marked Arabic text. The approach first applies linguistic analysis based on intonation rules, then uses Dynamic Time Warping (DTW) to match each reference word's audio to its position in the unseen sentence audio. The model outputs a list of words with their start and end times in the recording. Tested on the Qur’an dataset, DASAM outperforms Google Speech-to-Text (STT) in predicting word timings: it achieves text-audio alignment accuracy of 0.959 for word start times and 0.957 for word end times, compared with 0.870 and 0.849 for Google STT. DASAM also employs advanced signal processing techniques and demonstrates robustness across a range of audio variations. These results establish DASAM as a fundamental building block for speech-to-text conversion and linguistic research in Arabic, particularly for applications involving diacritics.
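The DTW matching step described in the abstract can be illustrated with a minimal subsequence-DTW sketch. This is not the authors' implementation: the abstract does not specify the audio features, the distance measure, or the DTW variant, so everything below is an assumption made for illustration. Each frame is reduced to a single number (a real system would more likely compare frame-level feature vectors), the local distance is an absolute difference, and the names `subsequence_dtw` and `frames_to_seconds` are hypothetical.

```python
from math import inf


def subsequence_dtw(query, sequence):
    """Locate `query` (reference word) inside `sequence` (sentence audio).

    Both inputs are lists of per-frame scalar features (an assumption).
    Returns (start, end, cost): inclusive frame indices into `sequence`
    and the accumulated alignment cost of the best match.
    """
    n, m = len(query), len(sequence)
    if n == 0 or m == 0:
        raise ValueError("both inputs must be non-empty")

    # cost[i][j]: best cost of aligning query[0..i] to a span ending at frame j
    # start[i][j]: frame at which that best span begins
    cost = [[inf] * m for _ in range(n)]
    start = [[0] * m for _ in range(n)]

    # Free start: the word may begin at any frame of the sentence.
    for j in range(m):
        cost[0][j] = abs(query[0] - sequence[j])
        start[0][j] = j

    for i in range(1, n):
        for j in range(m):
            # Begin with the vertical predecessor, then try the other two.
            best_cost, best_start = cost[i - 1][j], start[i - 1][j]
            if j > 0:
                for c, s in (
                    (cost[i][j - 1], start[i][j - 1]),          # horizontal
                    (cost[i - 1][j - 1], start[i - 1][j - 1]),  # diagonal
                ):
                    if c < best_cost:
                        best_cost, best_start = c, s
            cost[i][j] = abs(query[i] - sequence[j]) + best_cost
            start[i][j] = best_start

    # Free end: pick the cheapest frame at which the whole query is consumed.
    end = min(range(m), key=lambda j: cost[n - 1][j])
    return start[n - 1][end], end, cost[n - 1][end]


def frames_to_seconds(start, end, hop_seconds):
    """Convert inclusive frame indices to (start_time, end_time) in seconds."""
    return start * hop_seconds, (end + 1) * hop_seconds


# Toy example: the "word" 1-3-5-3-1 appears, slightly stretched, in the sentence.
word = [1, 3, 5, 3, 1]
sentence = [0, 0, 0, 1, 3, 3, 5, 3, 1, 0, 0]
print(subsequence_dtw(word, sentence))  # (3, 8, 0)
```

The free start and free end are what let the reference word be found anywhere in a longer recording, and the warping path absorbs the repeated frame in the sentence. Passing the returned indices to `frames_to_seconds` with an assumed hop length yields the kind of per-word start and end times the abstract describes as the model's output.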