Winarno, Sri
Unknown Affiliation

Published: 1 document
Articles


Optimizing YOLO11 for Dense Crowd Counting under Severe Occlusion via Head-Detection Fine-Tuning
Sutrisno, Joko; Winarno, Sri; Affandy, Affandy
Jurnal Teknik Informatika (Jutif) Vol. 7 No. 2 (2026): JUTIF Volume 7, Number 2, April 2026
Publisher: Informatika, Universitas Jenderal Soedirman

DOI: 10.52436/1.jutif.2026.7.2.5699

Abstract

Accurate, real-time people counting is essential for crowd management and public safety, yet precision in high-density environments remains difficult because of severe visual occlusion. Although the recently released YOLO11 architecture introduces advanced features such as the C3k2 and C2PSA modules, its performance as a pre-trained model for people counting has not been fully explored. This study evaluates a head-detection-based fine-tuning strategy for YOLO11 against the default pre-trained baseline. Fine-tuning is analyzed across three scenarios: S1 (full fine-tuning at 960 pixels), S2 (partial backbone freezing at 960 pixels), and S3 (partial freezing at 640 pixels). Fine-tuning was conducted on the CC_Mach_1 dataset from Roboflow Universe, which consists of high-density images annotated for head detection. The results show that the baseline pre-trained YOLO11, which relies on full-body features, performs very poorly, with an mAP@0.5 of 0.017 and a Mean Absolute Error (MAE) of 100.3. In contrast, all fine-tuned scenarios achieved substantial improvements. S1 reached the highest accuracy, with an mAP@0.5 of 0.682, and reduced the MAE by 62%, from 100.3 to 37.8. S2 remained highly competitive with an MAE of 39.6, whereas the MAE in S3 rose to 46.9, indicating that the lower input resolution limits the model's ability to identify small-scale head features. These findings provide empirical evidence that domain-specific fine-tuning for head detection substantially improves the robustness of YOLO11 against occlusion. Beyond accuracy, this detection-based approach offers a more computationally efficient alternative to traditional density-map-based methods, making it well suited to deployment in real-time surveillance systems for large-scale public monitoring.
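
The counting metric in the abstract can be illustrated with a minimal Python sketch. It assumes that a per-image count is simply the number of head detections at or above a confidence threshold, and that MAE is the mean absolute difference between predicted and ground-truth counts; the abstract does not state the authors' exact counting rule or threshold. The function names, the 0.25 threshold, and the toy confidence values are hypothetical, while the MAE figures in `reported_mae` are taken from the abstract.

```python
from typing import Sequence


def count_heads(confidences: Sequence[float], threshold: float = 0.25) -> int:
    """Count detections whose confidence is at or above the threshold.

    The 0.25 threshold is an assumption, not a value from the abstract.
    """
    return sum(1 for c in confidences if c >= threshold)


def mean_absolute_error(predicted: Sequence[int], actual: Sequence[int]) -> float:
    """Mean absolute difference between predicted and true per-image counts."""
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("predicted and actual must be non-empty and equal in length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


def relative_reduction(baseline: float, improved: float) -> float:
    """Percentage by which `improved` is lower than `baseline`."""
    return (baseline - improved) / baseline * 100.0


# Toy example with hypothetical detection confidences for two images.
toy_confidences = [[0.91, 0.40, 0.12], [0.77, 0.30, 0.26, 0.05]]
toy_true_counts = [3, 5]
toy_pred_counts = [count_heads(c) for c in toy_confidences]
toy_mae = mean_absolute_error(toy_pred_counts, toy_true_counts)

# MAE values as reported in the abstract.
reported_mae = {"baseline": 100.3, "S1": 37.8, "S2": 39.6, "S3": 46.9}
reductions = {
    name: relative_reduction(reported_mae["baseline"], mae)
    for name, mae in reported_mae.items()
    if name != "baseline"
}

if __name__ == "__main__":
    print("toy predicted counts:", toy_pred_counts)
    print("toy MAE:", toy_mae)
    for name, pct in reductions.items():
        print(f"{name}: MAE reduced by {pct:.1f}% relative to baseline")
```

Under these assumptions, the relative reduction for S1 rounds to the 62% quoted in the abstract, and the same calculation places S2 and S3 on the same scale.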