Artificial intelligence-based automated assessment systems have developed rapidly over the past decade; however, studies that integrate their technological, pedagogical, socio-cultural, and ethical dimensions remain scarce. This study presents a multi-paradigmatic analysis of 24 articles published in Scopus Q1- and Q2-indexed journals between 2020 and 2025. Three research questions guide the analysis: what types of automated assessment systems have emerged and what they can do, what their pedagogical and ethical implications are, and what evaluative framework is required. The analysis identifies six categories of automated assessment systems, with recent studies shifting predominantly toward large language models. The findings indicate that technical superiority does not guarantee fairness or pedagogical validity. Three fundamental ethical issues recur across the reviewed studies: linguistic discrimination, a lack of system explainability, and the indispensable need for human oversight. In response, this study introduces the TAPE-H Framework (Technology, Assessment Theory, Pedagogy, Ethics, Human Oversight), an integrative evaluative model that assesses automated assessment systems holistically rather than by accuracy-based metrics alone.
Copyright © 2026