Object detection plays a crucial role in traffic surveillance, particularly in urban environments characterized by high vehicle density, diverse weather conditions, and limited computational resources. Although YOLOv9 and DETR have demonstrated strong performance on general object detection benchmarks, comparative research evaluating their effectiveness under the specific challenges of traffic surveillance remains scarce. These challenges include the need for real-time processing, accurate detection of small or partially occluded objects, and adaptability to complex traffic scenarios. This study addresses this gap by comparatively evaluating YOLOv9 and DETR on a custom traffic image dataset, with training varied from 10 to 50 epochs to observe performance progression. Evaluation metrics included mean average precision (mAP), precision, recall, F1-score, inference time, and object count per image. The results indicated that DETR achieved the highest accuracy across all metrics at the final training stage and detected up to 22 objects per image; however, its average inference time exceeded seven seconds per image, limiting its real-time applicability. Conversely, YOLOv9 achieved competitive accuracy with a significantly faster inference time of approximately 0.43 seconds per image. These findings provide practical insight into the trade-off between detection accuracy and processing efficiency, and offer guidance for model selection in operational traffic surveillance systems.
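The precision, recall, and F1-score metrics referenced above can be computed from per-image true positive (TP), false positive (FP), and false negative (FN) counts. The following minimal sketch illustrates those standard definitions; the sample counts are hypothetical and are not taken from the study's dataset.

```python
def detection_metrics(tp: int, fp: int, fn: int) -> dict:
    """Compute precision, recall, and F1-score from detection counts.

    tp: detections correctly matched to ground-truth objects
    fp: spurious detections with no matching ground truth
    fn: ground-truth objects the detector missed
    """
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) > 0 else 0.0)
    return {"precision": precision, "recall": recall, "f1": f1}

# Hypothetical counts for illustration only (not results from this study):
# 18 correct detections, 2 false alarms, 4 missed objects.
metrics = detection_metrics(tp=18, fp=2, fn=4)
```

With these illustrative counts, precision is 18/20 = 0.90, recall is 18/22 ≈ 0.818, and the F1-score (their harmonic mean) is ≈ 0.857; mAP additionally averages precision over recall levels and IoU thresholds, which is typically delegated to an evaluation toolkit rather than computed by hand.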