Accurate people counting in dynamic environments remains challenging because of variations in lighting, complex backgrounds, and occlusion. This study proposes a video-based people counting system that integrates a Convolutional Neural Network (CNN) with the YOLOv5 object detection model. Before detection, the system applies a structured preprocessing pipeline of frame extraction, normalization, and noise reduction to make the input data more consistent. The system was evaluated on ten real-world campus video sequences to assess detection reliability and counting accuracy. Experimental results show that the proposed method achieves high precision and recall in real-time detection across diverse scenarios. Performance degraded in frames containing dense crowds or low illumination, which indicates limitations under extreme conditions. These findings support the feasibility of lightweight CNN-based detectors for surveillance and monitoring applications, while highlighting the need for larger datasets and optimized training strategies to improve robustness in more complex environments.
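The stages named in the abstract (frame extraction, normalization, noise reduction, per-frame counting, and precision/recall scoring) can be illustrated with a minimal, dependency-free Python sketch. This is not the authors' implementation. The function names, the frame stride, the 3x3 mean filter, the 0.5 confidence threshold, and the use of class ID 0 for "person" (as in the COCO label map used by the pretrained YOLOv5 models) are all assumptions made for illustration, and the detector itself is represented only by its output records.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

# Assumption: COCO-style label map, where class ID 0 is "person".
PERSON_CLASS_ID = 0


@dataclass
class Detection:
    """One detector output record (hypothetical structure for this sketch)."""
    class_id: int
    confidence: float


def sample_frame_indices(total_frames: int, stride: int) -> List[int]:
    """Frame extraction: keep every `stride`-th frame index."""
    if stride < 1:
        raise ValueError("stride must be >= 1")
    return list(range(0, total_frames, stride))


def normalize(frame: Sequence[Sequence[int]]) -> List[List[float]]:
    """Normalization: scale 8-bit grayscale intensities to [0, 1]."""
    return [[pixel / 255.0 for pixel in row] for row in frame]


def mean_filter_3x3(frame: Sequence[Sequence[float]]) -> List[List[float]]:
    """Noise reduction: 3x3 mean filter over the in-bounds neighbors."""
    height, width = len(frame), len(frame[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = [
                frame[j][i]
                for j in range(max(0, y - 1), min(height, y + 2))
                for i in range(max(0, x - 1), min(width, x + 2))
            ]
            out[y][x] = sum(window) / len(window)
    return out


def count_people(detections: Sequence[Detection],
                 min_confidence: float = 0.5) -> int:
    """Counting: number of 'person' detections at or above the threshold."""
    return sum(
        1
        for d in detections
        if d.class_id == PERSON_CLASS_ID and d.confidence >= min_confidence
    )


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall
```

In this sketch, `count_people` would be called once per sampled frame on the detector's output for that frame, and `precision_recall` would be applied to true-positive, false-positive, and false-negative totals obtained by comparing detections with manual annotations.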
Copyright © 2026