Waste has become a serious environmental issue that requires effective and efficient management systems. This study compares three residual network (ResNet) variants (ResNet-34, ResNet-50, and ResNet-101) as backbones within the single shot detector (SSD) framework for visual waste detection. The dataset consists of 800 images in four categories (food, plastic, paper, and wood), split 70:20:10 for training, validation, and testing. The backbone architecture, optimizer (stochastic gradient descent (SGD) or Adam), and learning rate are varied across fifteen experimental configurations. Model performance is assessed using precision, recall, F1-score, and mean average precision (mAP). The results show that SSD–ResNet-34 trained with SGD at a learning rate of 0.0005 performs best, achieving an mAP of 91.02% and outperforming the deeper backbones. Deeper backbone architectures do not consistently improve accuracy; instead, they increase the risk of overfitting on small datasets. These findings indicate that a lightweight architecture, paired with appropriate hyperparameter settings, strikes a better balance between accuracy, computational efficiency, and generalization for small-scale waste detection tasks.
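As an illustration of the data split and the count-based metrics named above, the following is a minimal sketch and not the study's actual pipeline. It assumes a simple shuffled (non-stratified) split, hypothetical image file names, and the standard definitions of precision, recall, and F1-score from true-positive, false-positive, and false-negative counts; the helper names `split_dataset` and `precision_recall_f1` are introduced here for illustration only. Computing mAP additionally requires matching predicted boxes to ground truth by intersection over union and is not shown.

```python
import random


def split_dataset(items, ratios=(0.7, 0.2, 0.1), seed=0):
    """Shuffle items and split them into train/validation/test subsets.

    Assumption: a plain random split; the source does not say whether
    the split was stratified by category.
    """
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = round(len(items) * ratios[0])
    n_val = round(len(items) * ratios[1])
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from detection counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical file names standing in for the 800 dataset images.
image_ids = [f"img_{i:04d}.jpg" for i in range(800)]
train, val, test = split_dataset(image_ids)
print(len(train), len(val), len(test))  # 560 160 80
```

With 800 images, a 70:20:10 split yields 560 training, 160 validation, and 80 test images, which shows how small the held-out sets are when comparing deep backbones.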
Copyrights © 2026