Background: First-semester students in the Arabic Language Education Program often struggle to pronounce complex Arabic phonemes in the al-Ashwat wa al-Lahjat course. Aims: This study identifies the phonemes that pose pronunciation difficulties, examines the effectiveness of song-based media in improving students' phonetic accuracy, and explores students' perceptions of using songs in learning. Methods: A qualitative case study was conducted with 35 first-semester students from Class C of the Arabic Language Education Program at UIN Sunan Kalijaga Yogyakarta. Data were collected through participant observation and focus group discussions (FGDs). Results: Integrating song media, with hijaiyyah letter songs used to identify problematic phonemes and Asmaul Husna songs used for pronunciation practice, markedly improved students' phonetic articulation. The most challenging phonemes were ع ('ain), ص (shad), ض (dhad), غ (ghin), ظ (dhadz), خ (kha), ث (tsa), ح (ha), ق (qaf), ش (shin), and ط (tha). Song-based learning also fostered an engaging and enjoyable learning environment, increased motivation, and promoted collaborative learning. Implications: These findings indicate that song media can be an effective pedagogical tool for Arabic phonetic instruction, improving students' pronunciation accuracy while making learning more engaging and interactive.