Emerging Science Journal
Vol 5, No 1 (2021): February

Human Action Recognition in Videos using Convolution Long Short-Term Memory Network with Spatio-Temporal Networks

Ashok Sarabu (SITE School, VIT University, Vellore - 632014, Tamil Nadu)
Ajit Kumar Santra (SITE School, VIT University, Vellore - 632014, Tamil Nadu)



Article Info

Publish Date
01 Feb 2021

Abstract

Two-stream convolutional networks play an essential role as powerful feature extractors in human action recognition in videos. Recent studies have shown the importance of two-stream Convolutional Neural Networks (CNNs) for recognizing human actions. Recurrent Neural Networks (RNNs) combined with CNNs have achieved the best performance in video activity recognition. Encouraged by the results of combining CNNs with RNNs, we present a two-stream network with two CNNs and a Convolution Long Short-Term Memory (CLSTM) network. First, we extract spatio-temporal features with two CNNs built on pre-trained ImageNet models. Second, the outputs of the two CNNs from step one are combined and fed as input to the CLSTM to obtain the overall classification score. We also explore how different fusion functions for combining the two CNNs perform, and how fusing feature maps at different layers affects the result, and we identify the best fusion function together with the layer at which to apply it. To avoid overfitting, we adopt data augmentation techniques. Our proposed model demonstrates a substantial improvement over current two-stream methods on the benchmark datasets, with 70.4% on HMDB-51 and 95.4% on UCF-101 using the pre-trained ImageNet model. DOI: 10.28991/esj-2021-01254
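
To make the idea of a fusion function concrete, the following is a minimal NumPy sketch, not the authors' implementation. The abstract does not name the fusion functions that were evaluated, so the four modes shown here (sum, average, max, concatenation) are assumed as commonly used choices. The names `fuse`, `spatial_map`, and `temporal_map`, and the 4 x 7 x 7 feature-map shape, are hypothetical.

```python
import numpy as np


def fuse(spatial, temporal, mode="sum"):
    """Fuse two feature maps of shape (channels, height, width).

    Illustrative only: the modes are assumed, not taken from the paper.
    """
    if spatial.shape != temporal.shape:
        raise ValueError("feature maps must have the same shape")
    if mode == "sum":
        # Element-wise addition; output keeps the input shape.
        return spatial + temporal
    if mode == "average":
        # Element-wise mean of the two streams.
        return (spatial + temporal) / 2.0
    if mode == "max":
        # Element-wise maximum of the two streams.
        return np.maximum(spatial, temporal)
    if mode == "concat":
        # Stack along the channel axis; the channel count doubles.
        return np.concatenate([spatial, temporal], axis=0)
    raise ValueError(f"unknown fusion mode: {mode}")


# Hypothetical feature maps standing in for the outputs of the two CNN streams.
rng = np.random.default_rng(0)
spatial_map = rng.standard_normal((4, 7, 7))
temporal_map = rng.standard_normal((4, 7, 7))

fused_shapes = {
    mode: fuse(spatial_map, temporal_map, mode).shape
    for mode in ("sum", "average", "max", "concat")
}
```

In a pipeline like the one described, the fused map for each time step would then be passed to the recurrent stage. Note that concatenation changes the channel count, so the layer that follows must be sized accordingly, whereas the element-wise modes preserve the shape.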

Copyrights © 2021






Journal Info

Abbrev

ESJ

Publisher

Subject

Environmental Science

Description

Emerging Science Journal is not limited to a specific aspect of science and engineering but is instead devoted to a wide range of subfields in engineering and the sciences. It encourages a broad spectrum of contributions in engineering and the sciences. Articles of interdisciplinary nature are ...