This study aimed to develop and validate an AI-Enhanced Multimodal Authentic Assessment model for evaluating EFL students' speaking skills, addressing the limitations of conventional, subjective assessment methods. Using a qualitative-dominant mixed-methods design, the research involved 35 university students from a Communication and Language study program. Data were collected through authentic video-based speaking tasks, AI-assisted linguistic analysis (using Google Speech-to-Text and a ChatGPT-based evaluator), detailed multimodal rubric assessments, and student perception questionnaires. Data analysis followed three procedures: multimodal performance analysis using a validated rubric, comparative analysis of AI-generated linguistic metrics, and thematic analysis of questionnaire responses. Quantitative data from the AI metrics and Likert-scale questionnaire items were summarized with descriptive statistics, while qualitative data were analyzed thematically. The findings revealed that the multimodal assessment effectively captured verbal, prosodic, visual, and gestural aspects of performance, while the AI tools excelled at objectively analyzing micro-linguistic features such as pronunciation, speech rate, and vocabulary. Integrating human and AI evaluation produced a comprehensive hybrid model that provided richer, more informative feedback. Students also expressed positive perceptions of the clarity and usefulness of the AI-generated feedback. The study concludes that this integrated assessment model is well suited to 21st-century pedagogy and enhances the accuracy and quality of oral performance evaluation. The findings imply that educators can adopt this framework to create more objective, efficient, and holistic speaking assessments, ultimately fostering better learning outcomes.
Copyright © 2026