Tan, Ben Liu
Unknown Affiliation

Efficient Temporal Segmentation And Classification Of Short-Form Video Content Using Lightweight CNN-LSTM Architecture
Tan, Ben Liu; Liem, Chstina Angel; Amen, Mohamed
Journal of Technology Informatics and Engineering (JTIE), Vol. 5 No. 1 (2026): April
Publisher: University of Science and Computer Technology

DOI: 10.51903/jtie.v5i1.441

Abstract

The exponential rise of short-form video platforms such as TikTok, Instagram Reels, and YouTube Shorts has transformed digital content consumption patterns, creating both opportunities and challenges in media analysis. One critical need is the efficient segmentation and classification of temporal segments within these videos to enable applications in content moderation, targeted advertising, and audience behavior research. This study proposes a lightweight deep learning architecture that integrates Convolutional Neural Networks (CNN) for visual feature extraction and Long Short-Term Memory (LSTM) networks for temporal sequence modeling. The proposed CNN-LSTM framework is optimized for computational efficiency while maintaining high classification accuracy, making it suitable for deployment in resource-constrained environments. Experimental evaluations on a curated short-form video dataset show that the model achieves competitive performance compared with larger architectures, with significant reductions in memory usage and inference time. Furthermore, the temporal segmentation module effectively isolates meaningful visual-audio segments, enabling more precise classification outcomes. The results highlight the potential of lightweight architectures to address the scalability demands of modern video analysis systems without sacrificing accuracy. This research contributes to the growing discourse on efficient multimedia processing by bridging the gap between high-performance models and practical, real-time applications in the evolving short-form video ecosystem.
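
The abstract does not say how the temporal segmentation module locates segment boundaries. As a purely illustrative sketch, and not the authors' method, one common baseline cuts a video wherever consecutive per-frame feature vectors differ sharply. The function names, the cosine-distance criterion, and the threshold value below are all assumptions introduced for this example. In a pipeline like the one described, the feature vectors would plausibly come from the CNN stage, and each resulting segment would then be passed to the LSTM for classification.

```python
from math import sqrt
from typing import List, Sequence, Tuple


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def segment_by_feature_change(
    features: Sequence[Sequence[float]],
    threshold: float = 0.5,  # hypothetical value, not taken from the paper
) -> List[Tuple[int, int]]:
    """Split a frame sequence into half-open (start, end) index ranges.

    A new segment starts at frame i whenever the cosine distance between
    frame i-1 and frame i exceeds `threshold`.
    """
    if not features:
        return []
    segments: List[Tuple[int, int]] = []
    start = 0
    for i in range(1, len(features)):
        if cosine_distance(features[i - 1], features[i]) > threshold:
            segments.append((start, i))
            start = i
    segments.append((start, len(features)))
    return segments


# Toy per-frame features standing in for CNN embeddings (assumed data).
toy_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [1.0, 0.0]]
toy_segments = segment_by_feature_change(toy_features, threshold=0.5)
print(toy_segments)
```

On the toy input the two abrupt changes in direction produce three segments. A real system would need a threshold tuned on validation data, and likely some smoothing so that brief flashes or camera shake do not create spurious cuts.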