Jurnal Teknik Informatika (JUTIF)
Vol. 7 No. 2 (2026): JUTIF Volume 7, Number 2, April 2026

Optimizing YOLO11 for Dense Crowd Counting under Severe Occlusion via Head-Detection Fine-Tuning

Joko Sutrisno (Faculty of Computer Science, Dian Nuswantoro University, Semarang, Indonesia)
Sri Winarno (Faculty of Computer Science, Dian Nuswantoro University, Semarang, Indonesia)
Affandy Affandy (Faculty of Computer Science, Dian Nuswantoro University, Semarang, Indonesia)



Article Info

Publish Date
18 Apr 2026

Abstract

Accurate, real-time people counting is essential for crowd management and public safety, yet achieving precision in high-density environments remains a challenge because of severe visual occlusion. While the recently released YOLO11 architecture introduces advanced features such as the C3k2 and C2PSA modules, its performance as a pre-trained model for people-counting tasks has not been fully explored. This study evaluates a head-detection-based fine-tuning strategy for YOLO11 and compares it against the default pre-trained baseline. Fine-tuning performance is analyzed across three scenarios: S1 (full fine-tuning at 960 pixels), S2 (partial backbone freezing at 960 pixels), and S3 (partial freezing at 640 pixels). Fine-tuning was conducted on the CC_Mach_1 dataset from Roboflow Universe, which consists of high-density images annotated for head detection. The results show that the baseline pre-trained YOLO11, which relies on full-body features, performs very poorly, with an mAP@0.5 of 0.017 and a Mean Absolute Error (MAE) of 100.3. In contrast, all fine-tuned scenarios achieved substantial improvements. S1 reached the highest accuracy, with an mAP@0.5 of 0.682, and reduced the MAE by about 62%, from 100.3 to 37.8. S2 remained highly competitive with an MAE of 39.6, whereas the MAE in S3 rose to 46.9, indicating that a lower input resolution limits the model's ability to identify small-scale head features. These findings provide empirical evidence that domain-specific fine-tuning for head detection substantially improves the robustness of YOLO11 against occlusion. Beyond accuracy, this detection-based approach offers a more computationally efficient alternative to traditional density-map-based methods, making it well suited for deployment in real-time surveillance systems for large-scale public monitoring.
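
The counting metric reported in the abstract can be illustrated with a minimal sketch. It assumes that a per-image count is obtained by counting head detections whose confidence meets a threshold, and that MAE is the average absolute difference between predicted and true counts. The 0.25 threshold, the function names, and the toy detections are illustrative assumptions, not details taken from the study; only the MAE values 100.3 and 37.8 come from the abstract.

```python
from typing import Sequence


def count_heads(confidences: Sequence[float], threshold: float = 0.25) -> int:
    """Count detections whose confidence meets the threshold (assumed value)."""
    return sum(1 for c in confidences if c >= threshold)


def mean_absolute_error(predicted: Sequence[int], actual: Sequence[int]) -> float:
    """Average absolute difference between predicted and true per-image counts."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


def relative_reduction(baseline: float, improved: float) -> float:
    """Fractional reduction of an error metric relative to the baseline."""
    return (baseline - improved) / baseline


# Toy, made-up detection confidences for three images (not data from the study).
toy_confidences = [[0.9, 0.8, 0.1], [0.6, 0.3], [0.2, 0.95, 0.7, 0.5]]
toy_true_counts = [3, 2, 5]
toy_pred_counts = [count_heads(c) for c in toy_confidences]
toy_mae = mean_absolute_error(toy_pred_counts, toy_true_counts)

# MAE values reported in the abstract: baseline 100.3, S1 37.8.
s1_reduction = relative_reduction(100.3, 37.8)
print(f"toy MAE = {toy_mae:.2f}, S1 reduction = {s1_reduction:.1%}")
```

Applying the same `relative_reduction` helper to the other reported MAE values gives roughly 60.5% for S2 (39.6) and 53.2% for S3 (46.9), which matches the ordering S1, S2, S3 described in the abstract.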

Copyrights © 2026






Journal Info

Abbrev

jurnal

Subject

Computer Science & IT

Description

Jurnal Teknik Informatika (JUTIF) is an Indonesian national journal that publishes high-quality research papers in the broad field of Informatics, Information Systems and Computer Science, which encompasses software engineering, information system development, computer systems, computer network, ...