Vehicle detection plays a key role in automating traffic analysis, a field that continues to advance rapidly. Vision-based systems can identify vehicle types and sizes, but achieving high accuracy and efficiency at the same time remains a challenge. Reliable real-world deployment requires optimized models that balance detection performance against computational cost. YOLOv10n, the lightest variant of YOLOv10, offers a solid foundation for lightweight feature extraction. To improve its detection performance, this study enhances YOLOv10n with a scale-aware attention mechanism. We propose the Expanded Refinement Efficient Multi-Scale Attention (ER-EMA) module, which enhances feature encoding by capturing vehicle characteristics across multiple receptive fields. ER-EMA consists of two core components: the Expanded Converted Inverted Block (ECIB) and the Convolutional Refinement Block (CRB). These components use diverse convolutional kernels to extract and refine multi-frequency spatial features. Integrating ER-EMA into the YOLOv10n framework produces a more compact and accurate detection model. Experimental results on the Vehicle-COCO dataset show that the proposed model increases mAP@50 by 1% while reducing the number of parameters by 0.1M and the computational cost by 0.1 GFLOPs. On the UA-DETRAC benchmark, it improves mAP@50:95 by 4% while reducing the number of parameters by 0.2M and the computational cost by 0.4 GFLOPs, outperforming the original YOLOv10n and prior methods in both accuracy and computational efficiency.
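
The two accuracy metrics quoted above differ only in how strictly a predicted box must overlap the ground truth. The sketch below illustrates this, assuming the standard COCO convention (which the abstract does not spell out): mAP@50 accepts a match at IoU of at least 0.50, while mAP@50:95 averages over ten IoU thresholds from 0.50 to 0.95 in steps of 0.05. The function names and the example boxes are hypothetical and chosen only for illustration; this is not the evaluation code used in the study.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp at zero so non-overlapping boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Assumed COCO-style thresholds: 0.50, 0.55, ..., 0.95 (ten values).
COCO_IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(pred_box, truth_box):
    """Return the IoU thresholds at which the prediction counts as a match."""
    overlap = iou(pred_box, truth_box)
    return [t for t in COCO_IOU_THRESHOLDS if overlap >= t]


if __name__ == "__main__":
    truth = (0, 0, 100, 100)   # hypothetical ground-truth vehicle box
    pred = (10, 0, 110, 100)   # hypothetical prediction, shifted 10 px right
    print(f"IoU = {iou(pred, truth):.3f}")
    print("Matches at thresholds:", thresholds_passed(pred, truth))
```

A prediction like this one counts as correct for mAP@50 but fails the strictest thresholds, which is why gains in mAP@50:95 reflect tighter box localization rather than detection alone.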