Patil, Shrinivas A.
Unknown Affiliation

Published: 1 document


RNN-driven integration of spatial, temporal, features for Indian sign language recognition and video captioning
Pol, Ajay Manohar; Patil, Shrinivas A.
Indonesian Journal of Electrical Engineering and Computer Science Vol 38, No 2: May 2025
Publisher: Institute of Advanced Engineering and Science

DOI: 10.11591/ijeecs.v38.i2.pp821-829

Abstract

This paper presents a novel model that combines spatial features extracted by residual blocks with temporal features derived from the fast Fourier transform (FFT), and feeds them to a recurrent architecture composed of bidirectional LSTM (BiLSTM) layers, gated recurrent unit (GRU) layers, and multi-head attention. The model achieves nearly 99% accuracy on both the WLASL and INCLUDE datasets and outperforms standard pretrained CNN models at feature extraction. The BiLSTM and GRU combination also proves superior to other combinations, such as LSTM and GRU. BLEU score analysis further supports the model's efficacy, with scores of 0.51 and 0.54 on the WLASL and INCLUDE datasets, respectively. These results indicate that the model captures the spatial and temporal detail of sign language gestures, which supports accessibility and communication for the deaf and hard-of-hearing communities. The comparison with standard approaches underlines the importance of the integrated architecture. Further refinement and optimization may improve the model's performance and its applicability in real-world scenarios, contributing to inclusive communication environments.
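To give a sense of what the reported BLEU scores measure, the following is a minimal sketch of unsmoothed, sentence-level BLEU against a single reference. The abstract does not state the n-gram order, smoothing, or whether scores are computed per sentence or per corpus, so the uniform 1-to-4-gram weighting, the function names `ngrams` and `sentence_bleu`, and the example sentences are all assumptions made for illustration, not the authors' evaluation code.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(reference, candidate, max_n=4):
    """Unsmoothed sentence-level BLEU against a single reference.

    Assumption: uniform weights over 1..max_n-grams; the paper's exact
    BLEU configuration is not given in the abstract.
    """
    if not candidate:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(candidate, n)
        ref_counts = ngrams(reference, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        matched = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or matched == 0:
            return 0.0  # without smoothing, any zero precision gives BLEU 0
        log_precisions.append(math.log(matched / total))
    # The brevity penalty lowers the score of candidates shorter than the reference.
    c_len, r_len = len(candidate), len(reference)
    bp = 1.0 if c_len > r_len else math.exp(1 - r_len / c_len)
    # Geometric mean of the n-gram precisions, scaled by the brevity penalty.
    return bp * math.exp(sum(log_precisions) / max_n)


# Hypothetical caption pair, not taken from WLASL or INCLUDE.
reference = "i am going to school".split()
candidate = "i am going to the school".split()
score = sentence_bleu(reference, candidate)
```

Under these assumptions a score of 1.0 means the generated caption reproduces the reference exactly, and a single inserted word already pulls the score well below 1, which gives some context for reading values such as 0.51 and 0.54.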