Growing motor vehicle traffic calls for automated monitoring that classifies heavy vehicles into categories (Category I–V) by axle count using side-view cameras. This is a difficult fine-grained visual classification task because trucks in different categories have similar body shapes. To balance high accuracy against computational efficiency, this study implements an Adaptive Minimal Ensemble (AME) architecture that adaptively combines small-scale models. The models are evaluated using a confusion matrix together with accuracy, precision, recall, and F1-score. The test results show that a single EfficientNetV2-S model reaches a maximum accuracy of only 83% and struggles to extract the key distinguishing features, which leads to misclassification of Category IV and V vehicles. In contrast, the AME architecture, built on the two best-performing EfficientNetV2-S base models, achieves 95% accuracy, 95.21% precision, 95% recall, and a 94.99% F1-score. These results indicate that the adaptive layer in AME compensates for the individual prediction weaknesses of its base models, yielding a substantially more precise vehicle classification monitoring system.
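As an illustration of how the reported metrics relate to a confusion matrix, the following is a minimal sketch rather than the study's evaluation code. It assumes macro averaging over the classes, which the abstract does not specify. The function name `classification_metrics` and the toy matrix are hypothetical and not taken from the study.

```python
from typing import Dict, List


def classification_metrics(cm: List[List[int]]) -> Dict[str, float]:
    """Compute accuracy and macro-averaged precision, recall, and F1.

    cm[i][j] is the number of samples whose true class is i and whose
    predicted class is j. Macro averaging is an assumption here.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(n))

    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = cm[k][k]
        predicted_k = sum(cm[i][k] for i in range(n))  # column sum
        actual_k = sum(cm[k])                          # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    return {
        "accuracy": correct / total if total else 0.0,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy 5-class matrix (Category I-V); the counts are invented for illustration.
toy_cm = [
    [20, 0, 0, 0, 0],
    [0, 19, 1, 0, 0],
    [0, 1, 19, 0, 0],
    [0, 0, 0, 18, 2],
    [0, 0, 0, 1, 19],
]

for name, value in classification_metrics(toy_cm).items():
    print(f"{name}: {value:.4f}")
```

With the same number of test images per category, macro-averaged recall equals overall accuracy, which would be consistent with the matching 95% figures reported above. This is an observation about the formulas, not a confirmed detail of the study's test set.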
Copyright © 2026