This study investigates how audio-only and multimodal (audio plus visual) input affects speech perception among Indonesian EFL learners. In a quasi-experimental design, 60 third-semester English literature students were divided into two groups: one listened to an audio recording and the other watched an audio-visual video of the same narrative, “The Little Red Hen.” Both groups then completed a 20-item speech perception test and questionnaires on emotional engagement and learning satisfaction. The multimodal group achieved significantly higher test scores than the audio-only group, with a large effect size indicating a substantial advantage of visual cues such as facial expressions and gestures in supporting listening. Correlation analyses also revealed significant positive relationships among emotional engagement, learning satisfaction, and speech perception in both conditions, with stronger coefficients in the multimodal group. These findings suggest that multimodal input not only improves comprehension by reducing cognitive load and enriching contextual information but also strengthens affective factors that are crucial for successful language learning. The study recommends that EFL educators incorporate multimodal materials into listening instruction and calls for further research on the long-term impact of different types of visual cues in varied learning contexts.
Copyright © 2025