JOIV : International Journal on Informatics Visualization
Vol 9, No 2 (2025)

YoloV8, EfficientNetv2, and CSP Darknet Comparison as Recognition Model’s Backbone for Drone-Captured Images

Kridalukmana, Rinta (Unknown)
Eridani, Dania (Unknown)
Septiana, Risma (Unknown)
Windasari, Ike Pertiwi (Unknown)



Article Info

Publish Date
05 Apr 2025

Abstract

Artificial intelligence (AI) has recently empowered drones to support smart city apps and recognize on-the-ground objects or events. Various pre-trained backbones are available to develop object recognition models, and some of them could boost the models’ accuracy. Consequently, it becomes difficult for practitioners to select a suitable backbone as a feature extractor during recognition model development. Hence, this research aims to provide a benchmark examining the performance of three popular backbones in supporting recognition models using images captured by drones as the dataset. This research used the UAV-AUAIR dataset and compared three deep learning backbone architectures as the feature extractor, namely YoloV8_s, EfficientNetv2_s, and CSP_DarkNet_l. The head part of each selected backbone was replaced with YoloV8Detector architecture, provided by Keras-CV, to perform the inference tasks. The models generated during training were evaluated against four measurement methods: loss function, intersection over union (IOU), across-scale mean average precision (mAP), and computational performance. The results showed that the model generated using EfficientNetv2_s backbone outperformed the others in most criteria, except the computational performance and detecting small objects, which was won by YOLOV8_s and CSP_Darknet_l, respectively. Thus, EfficientNetv2_s and CSP_DarkNet_l can be considered if app development concerns accuracy. Meanwhile, YoloV8_s is far better when computational performance is essential, as its prediction time achieved 0.8 seconds per image. This study is essential as a reference for practitioners, particularly those who want to develop an object-recognition model based on a pre-trained backbone.

Copyrights © 2025






Journal Info

Abbrev

joiv

Publisher

Subject

Computer Science & IT

Description

JOIV : International Journal on Informatics Visualization is an international peer-reviewed journal dedicated to interchange for the results of high quality research in all aspect of Computer Science, Computer Engineering, Information Technology and Visualization. The journal publishes state-of-art ...