Building of Informatics, Technology and Science
Vol 7 No 3 (2025): December 2025

Comprehensive Benchmark of YOLOv11n, SSD MobileNet, CenterFace, YuNet, FastMtCnn, HaarCascade, and LBP for Face Detection in Video Based Driver Drowsiness

Go, Agnestia Agustine Djoenaidi
Alzami, Farrikh
Naufal, Muhammad
Azies, Harun Al
Winarno, Sri
Pramunendar, Ricardus Anggi
Megantara, Rama Aria
Maulana, Isa Iant
Arif, Mohammad



Article Info

Publish Date
16 Dec 2025

Abstract

Face detection is a critical foundation of video-based drowsiness monitoring systems because all downstream tasks, such as eye-closure estimation, yawning detection, and head-movement analysis, depend on correctly identifying the face region. Many previous studies rely on detector-generated outputs as ground truth, which can introduce bias and inflate reported model performance. To avoid this limitation, we manually constructed a ground-truth dataset of 1,229 frames extracted from 129 yawning and microsleep videos in the NITYMED dataset. Ten representative frames were sampled from each video using a face-guided extraction script, and all frames were manually annotated in Roboflow following the COCO format to ensure accurate bounding-box labeling under varying lighting, head poses, and facial deformation. Using this manually annotated dataset, we conducted a comprehensive benchmark of seven face-detection algorithms: YOLOv11n, SSD MobileNet, CenterFace, YuNet, FastMtCnn, HaarCascade, and LBP. The evaluation focused on localization quality, measured with Intersection over Union (IoU, with a match threshold of IoU ≥ 0.5) and Dice similarity, so that each algorithm's predicted bounding box could be compared directly against the human-defined ground truth. The results show that HaarCascade achieved the highest IoU and Dice scores, particularly in frontal and well-lit frames. FastMtCnn also aligned closely with the ground truth and produced a high number of correctly matched frames. CenterFace and SSD MobileNet demonstrated smooth bounding-box fitting with competitive Dice scores, while YOLOv11n and YuNet delivered moderate but stable performance across most samples. LBP showed the weakest results, mainly due to its sensitivity to lighting variations and soft-texture regions. Overall, this benchmark provides an unbiased and comprehensive comparison of modern and classical face-detection algorithms for video-based driver-drowsiness applications.
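
The two localization metrics named in the abstract can be illustrated with a small sketch. The abstract does not give the authors' evaluation code, so everything below is an assumption made for illustration: the function names (`coco_to_corners`, `iou`, `dice`, `is_match`) are hypothetical, boxes are taken to be axis-aligned, ground-truth boxes are assumed to be stored in the COCO `[x, y, width, height]` layout, and a prediction is counted as a match when IoU ≥ 0.5.

```python
from typing import Sequence, Tuple

# Corner-format box: (x_min, y_min, x_max, y_max). Hypothetical alias for this sketch.
Box = Tuple[float, float, float, float]


def coco_to_corners(bbox: Sequence[float]) -> Box:
    """Convert a COCO-style [x, y, width, height] box to corner format."""
    x, y, w, h = bbox
    return (x, y, x + w, y + h)


def area(box: Box) -> float:
    """Area of a corner-format box; degenerate boxes have area 0."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def intersection_area(a: Box, b: Box) -> float:
    """Area of the overlap between two boxes (0 if they do not overlap)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(0.0, w) * max(0.0, h)


def iou(a: Box, b: Box) -> float:
    """Intersection over Union: |A ∩ B| / |A ∪ B|."""
    inter = intersection_area(a, b)
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


def dice(a: Box, b: Box) -> float:
    """Dice similarity: 2 |A ∩ B| / (|A| + |B|)."""
    total = area(a) + area(b)
    return 2.0 * intersection_area(a, b) / total if total > 0 else 0.0


def is_match(pred: Box, gt: Box, threshold: float = 0.5) -> bool:
    """Assumed matching rule: a prediction matches when IoU >= threshold."""
    return iou(pred, gt) >= threshold


if __name__ == "__main__":
    gt = coco_to_corners([0, 0, 10, 10])   # manually annotated box (made-up numbers)
    pred = (5.0, 0.0, 15.0, 10.0)          # detector output (made-up numbers)
    print(iou(pred, gt), dice(pred, gt), is_match(pred, gt))
```

For a single pair of boxes the two metrics are related by Dice = 2·IoU / (1 + IoU), so they rank any one prediction identically. Averages over many frames can still differ, which may be one reason to report both.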

Copyrights © 2025






Journal Info

Abbrev

bits

Publisher

Subject

Computer Science & IT

Description

Building of Informatics, Technology and Science (BITS) is an open-access journal that publishes scientific articles reporting research results in information technology and computing. Papers submitted to this journal are checked for plagiarism and peer-reviewed first to maintain quality. ...