Surface electromyogram (sEMG)-based hand gesture recognition has the potential to enable more natural prosthetic control, but its performance is often degraded by domain shift (electrode drift, muscle fatigue, cross-day variability) and limited model interpretability. This study proposes an interpretable and adaptive deep learning framework that combines two representation streams (time-domain and time-frequency) with multi-head attention and attribution consistency regularization to generate stable, clinician-auditable relevance maps. Robustness across sessions and subjects is enhanced through self-supervised pretraining (temporal contrastive learning), few-shot calibration, and adversarial domain alignment (DANN). Uncertainty estimation and calibration (MC-Dropout with temperature scaling) drive confidence-gated control as a safety mechanism. Evaluation across three scenarios shows within-session accuracy of 96.8% (macro-F1 96.1%), cross-session accuracy of 91.7% (macro-F1 90.5%, ECE ≈ 3.6%), and cross-subject accuracy of 86.4%. Edge optimization (INT8 quantization with structured pruning) reduces inference latency from 92 ms (FP32) to ~52–66 ms at a cost of only ~2% accuracy, with power consumption of ~5–6 W, meeting real-time response requirements. Reliability diagrams confirm that the predicted probabilities are well calibrated, while ablation analysis demonstrates significant contributions of the attention mechanism, the attribution consistency loss, and DANN to both performance and stability. Overall, this framework helps bridge the lab-to-clinic gap by providing an accurate, explainable, adaptive, efficient, and safe solution for real-time hand prosthesis control, while opening research directions in longitudinal, user co-adaptive continuous control.
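To make the safety mechanism concrete, the sketch below illustrates how MC-Dropout, temperature scaling, and a confidence gate can be composed at inference time. It is a minimal illustration under assumed details, not the paper's implementation: the stand-in classifier, the temperature value, the number of stochastic passes (n_samples), and the gating threshold (conf_threshold) are all hypothetical placeholders, and the paper's dual-stream attention model would replace the toy network.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GestureClassifier(nn.Module):
    """Stand-in classifier; the paper's dual-stream attention model would go here."""
    def __init__(self, in_dim=64, n_classes=8, p_drop=0.3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(),
            nn.Dropout(p_drop),  # kept active at test time for MC-Dropout
            nn.Linear(128, n_classes),
        )

    def forward(self, x):
        return self.net(x)

@torch.no_grad()
def mc_dropout_predict(model, x, n_samples=20, temperature=1.5):
    """Average temperature-scaled softmax outputs over stochastic forward passes."""
    # train() re-enables dropout; in a real model only the Dropout layers
    # should be switched to train mode so BatchNorm statistics stay frozen.
    model.train()
    probs = torch.stack([
        F.softmax(model(x) / temperature, dim=-1) for _ in range(n_samples)
    ]).mean(dim=0)
    return probs

def gated_command(probs, conf_threshold=0.85):
    """Confidence gate: emit a gesture class only when calibrated confidence is high."""
    conf, pred = probs.max(dim=-1)
    # Below the threshold, return -1 as a "no-op / hold current grip" command.
    return torch.where(conf >= conf_threshold, pred, torch.full_like(pred, -1))

# Example: one feature window -> gated decision.
model = GestureClassifier()
window = torch.randn(1, 64)          # placeholder features, not real sEMG
probs = mc_dropout_predict(model, window)
print(gated_command(probs))          # tensor([-1]) or a class index
```

In this arrangement the temperature would be fitted on a held-out calibration set (lowering ECE as reported in the abstract), and the gate converts residual uncertainty into a conservative no-op rather than an erroneous prosthesis command.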