Violence detection in CCTV footage remains a critical public-safety challenge and calls for automated solutions that overcome the limits of human monitoring. This study proposes an LSTM-based framework that improves detection accuracy by analyzing temporal patterns in surveillance video. On a balanced dataset of 2,000 videos (1,000 violent, 1,000 non-violent), the model extracts spatio-temporal features via optical flow and achieves 93% training accuracy and 91% test accuracy, with a precision of 92% and an AUC of 0.94. The results indicate significant improvements over traditional methods, particularly in dynamic scenes, although performance drops for occluded actions and weapon-related violence. The discussion covers the model’s suitability for real-time use, its computational cost (120 ms latency per segment), and its fit with smart-city surveillance requirements. Limitations include limited dataset diversity and environmental variability, which motivate future work on multi-modal data fusion and edge computing. This research contributes to AI-powered security systems by offering a robust tool for proactive threat detection, while underscoring the need for scalable, context-aware solutions.
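For readers less familiar with the reported metrics, the following is a minimal sketch of how accuracy, precision, and AUC are conventionally computed for a binary violent/non-violent classifier. It is not the study's evaluation code: the function names, the 0.5 decision threshold, and the six toy scores are assumptions chosen purely for illustration and are unrelated to the reported results.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of clips whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def precision(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Of the clips flagged as violent (1), the fraction that truly are."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fp) if (tp + fp) else 0.0


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random violent clip is scored
    higher than a random non-violent clip (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one clip of each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the study): 1 = violent, 0 = non-violent.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]  # assumed model confidence per clip
THRESHOLD = 0.5                           # assumed decision threshold
preds = [1 if s >= THRESHOLD else 0 for s in scores]

acc = accuracy(labels, preds)
prec = precision(labels, preds)
auc = roc_auc(labels, scores)
```

Accuracy and precision depend on the chosen threshold, whereas AUC is computed from the raw scores and summarizes ranking quality across all thresholds, which is why the two kinds of figure are reported side by side.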
Copyrights © 2025