Video classification is essential in computer vision, enabling automated understanding of dynamic content in applications such as surveillance, autonomous systems, and content recommendation. Traditional long-term recurrent convolutional network (LRCN) models, however, often struggle to capture complex spatio-temporal patterns, limiting classification performance across diverse video datasets. To address this limitation, we propose an enhanced LRCN with architectural refinements, optimized filter sizes, and hyperparameter tuning, improving both temporal modeling and spatial feature extraction. Experimental results on three benchmark datasets (DynTex, UCF11, and UCF50) demonstrate that the proposed model achieves accuracies of 0.90 on DynTex (+26.8% over the standard LRCN), 0.92 on UCF11 (+19.5%), and 0.94 on UCF50 (+1.1%), consistently outperforming ConvLSTM, the standard LRCN, and other state-of-the-art approaches. These findings indicate that the enhanced LRCN effectively captures both spatial and temporal dynamics in video sequences, setting a new benchmark for video classification. The study highlights the impact of architectural innovation and parameter optimization, providing a solid foundation for future research on scalable and efficient deep learning models for dynamic content analysis.
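The LRCN family referred to above pairs a per-frame convolutional feature extractor with an LSTM over the resulting feature sequence. The following is a minimal sketch in PyTorch with illustrative layer sizes only; the class name `LRCN` and all filter counts, kernel sizes, and hidden dimensions here are assumptions for demonstration, not the tuned configuration proposed in the paper:

```python
import torch
import torch.nn as nn

class LRCN(nn.Module):
    """Minimal LRCN sketch: shared CNN applied to each frame, then an LSTM.

    All sizes are illustrative placeholders, not the paper's tuned values.
    """

    def __init__(self, num_classes=11, hidden=64):
        super().__init__()
        # Per-frame spatial feature extractor (applied to every frame).
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(4),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # -> (B*T, 32, 1, 1)
        )
        # Temporal model over the per-frame feature sequence.
        self.lstm = nn.LSTM(input_size=32, hidden_size=hidden, batch_first=True)
        self.fc = nn.Linear(hidden, num_classes)

    def forward(self, x):
        # x: (batch, time, channels, height, width)
        b, t = x.shape[:2]
        feats = self.cnn(x.flatten(0, 1)).flatten(1)  # (B*T, 32)
        feats = feats.view(b, t, -1)                  # (B, T, 32)
        out, _ = self.lstm(feats)
        return self.fc(out[:, -1])  # classify from the last time step

model = LRCN(num_classes=11)
logits = model(torch.randn(2, 8, 3, 64, 64))  # 2 clips of 8 RGB 64x64 frames
print(logits.shape)  # torch.Size([2, 11])
```

The shared CNN weights across time steps are what distinguish an LRCN from treating each frame independently; the LSTM then aggregates the frame-level features into a single clip-level prediction.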
Copyright © 2026