This study investigates how Madurese phonetic interference affects the accuracy of AI-based Arabic automatic speech recognition (ASR) systems. It focuses on the vowel shifts that native Madurese speakers produce when pronouncing Arabic and evaluates how well Google Speech-to-Text and OpenAI Whisper recognize this dialect-influenced speech. In a mixed-methods design, speech data were collected from 13 sixth-semester students of the Arabic Language Education Department at UIN Madura through direct voice recordings of isolated words, continuous Arabic sentences, and Qur’anic recitations. Acoustic observation using Praat identified systematic vowel shifts, particularly /a/ to [e], /i/ to [e], and /u/ to [o], which introduced acoustic variability into the speakers' Arabic speech production. OpenAI Whisper achieved higher recognition accuracy (84.0%) and a lower Word Error Rate (WER, 16.0%) than Google Speech-to-Text (accuracy 61.8%, WER 38.2%). The dominant error types were substitution, deletion, insertion, and segmentation errors, indicating unstable phoneme mapping caused by dialect-induced vowel transitions. The study concludes that current AI-based Arabic ASR systems remain sensitive to low-resource dialect interference and that acoustic variability significantly affects recognition stability in multilingual speech environments. These findings highlight the importance of developing more adaptive and linguistically inclusive ASR models for Arabic language learning contexts.
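For readers unfamiliar with the metric, the following is a minimal sketch of how a Word Error Rate is conventionally computed: the word-level edit distance (substitutions + deletions + insertions) divided by the number of reference words. The abstract does not state the scoring procedure or any text normalization, so this is an assumption, as is reading accuracy as 100% minus WER (a reading the reported pairs 84.0%/16.0% and 61.8%/38.2% are consistent with). The function name and the placeholder tokens are hypothetical and are not taken from the study's data.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Conventional WER: word-level Levenshtein distance / reference length.

    Assumption: whitespace tokenization, no diacritic or punctuation
    normalization. The study's actual scoring setup is not specified.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical placeholder tokens: one substitution (w2 -> wX), one deletion (w4).
wer = word_error_rate("w1 w2 w3 w4 w5", "w1 wX w3 w5")
print(f"WER = {wer:.1%}, accuracy = {1 - wer:.1%}")  # WER = 40.0%, accuracy = 60.0%
```

Because insertions are counted against the reference length, WER computed this way can exceed 100% when the hypothesis is much longer than the reference.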