This study evaluates the Automated Flowchart Assessment Tool (AFAT), which addresses the limitations in semantic sensitivity and layout robustness prevalent in existing tools. In a quantitative analysis of 312 student submissions, AFAT demonstrated superior diagnostic performance, achieving a Micro-F1 score of 0.92 and high inter-rater agreement (Fleiss' Kappa = 0.88), supporting the hypothesis of expert-level accuracy. Key findings show that AFAT significantly improves operational efficiency, reducing evaluation time by 61.2% (an average of 1.87 minutes per flowchart) while decreasing inter-rater variability by 28%. A Generalized Linear Model (GLM) analysis confirmed significant time savings, particularly in high-complexity sessions (Wald χ² = 87.44, p < 0.001). Beyond technical efficiency, this research contributes to applied science education by providing a scalable framework for computational science literacy, enabling rigorous assessment of algorithmic thinking within integrated STEM curricula. These results substantiate AFAT's potential for large-scale deployment as a robust automated scoring tool in formal educational settings.
Copyright © 2025