Multimodal AI Framework for Sign Language Recognition and Medical Informatics in Hearing-Impaired Patients
Nuankaew, Pratya; Khamthep, Parin; Jaitem, Patdanai; Nuankaew, Kuljira S.; Nuankaew, Kaewpanya S.; Nuankaew, Wongpanya S.
Journal of Applied Data Sciences Vol 7, No 1: January 2026
Publisher : Bright Publisher

DOI: 10.47738/jads.v7i1.1096

Abstract

This study assesses the feasibility of YOLO-based detectors for recognizing Thai Sign Language (TSL) within clinical intake workflows. We benchmark YOLOv5 through YOLOv10 over 100 to 150 training epochs, evaluating Precision, Recall, mAP@50, and mAP@50:95 alongside training and validation losses to gauge stability. Losses decrease steadily as detection metrics improve; YOLOv10 offers the best balance, with Precision of 0.953, Recall of 0.939, mAP@50 of 0.933, and mAP@50:95 of 0.492. The gains at stricter IoU thresholds are modest, underscoring ongoing challenges in accurate localization and in generalization across varying lighting conditions, viewpoints, occlusions, and motion. YOLOv11 was excluded from the primary results due to abnormal loss behavior. These findings support a multimodal pipeline that employs an image-based detector as the central perception component, supplemented with pose and keypoint cues, as well as OCR and NLP layers, to transform recognized signs into structured medical intents for triage and telemedicine applications. Future research will focus on expanding sequence-level evaluation, incorporating TSL dialects and co-articulation, and developing compressed or distilled models to enable reliable on-device inference in resource-constrained environments.
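The gap the abstract reports between mAP@50 (0.933) and mAP@50:95 (0.492) comes down to how strictly predicted boxes must overlap ground truth. The following is an illustrative sketch, not taken from the paper: it shows how intersection-over-union (IoU) is computed for two boxes and how the two metrics apply different threshold regimes. The example boxes are invented for demonstration.

```python
# Illustrative sketch: IoU matching behind mAP@50 vs mAP@50:95.
# Boxes are axis-aligned tuples (x1, y1, x2, y2); values are made up.

def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# mAP@50 counts a prediction correct if IoU >= 0.50; mAP@50:95 averages
# precision over thresholds 0.50, 0.55, ..., 0.95, so a loosely localized
# box passes the first cutoff but fails the stricter ones.
pred = (10, 10, 50, 50)    # hypothetical predicted box
truth = (15, 15, 55, 55)   # hypothetical ground-truth box
score = iou(pred, truth)   # ~0.62: a hit at IoU 0.50 but a miss at 0.75+
thresholds = [0.50 + 0.05 * i for i in range(10)]
hits = [t for t in thresholds if score >= t]
```

A detector can therefore score well at mAP@50 while its mAP@50:95 stays low, which is consistent with the localization challenge the abstract describes.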