Temporary road repairs often force bidirectional traffic to alternate through a single lane, which increases delays and unbalances flow between the two directions. This study investigates how reward scheme design affects the performance of a Deep Q-Network (DQN)-based adaptive traffic signal control system in such constrained environments. Using Simulation of Urban Mobility (SUMO), a scenario of 1,656 vehicles over 1,800 seconds was modeled to evaluate six reward scheme configurations that combine Traffic Flow (TF), Waiting Time (WT), and Average Speed (AS): TF-TF, TF-WT, WT-TF, WT-WT, AS-TF, and AS-WT. The DQN agent, implemented with a two-layer neural network and trained for 50 epochs, dynamically adjusted signal timing to balance traffic from opposing directions. Experimental results indicate that the AS-WT configuration achieved the most balanced performance, producing the best fairness index (1.04) while maintaining stable traffic flow. In contrast, schemes with misaligned or redundant metrics performed significantly worse. These findings highlight the importance of reward design in reinforcement learning-based traffic signal control and suggest that carefully selected reward schemes can improve fairness and efficiency in temporary road repair zones.
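The abstract names the three reward metrics and a fairness index but does not give their formulas. The following Python sketch shows one plausible reading, not the study's implementation. The function names (`split_scheme`, `metric_reward`, `fairness_index`), the sign conventions, and the definition of fairness as the ratio of the larger to the smaller mean waiting time of the two directions are all assumptions made for illustration. The role each half of a scheme name such as AS-WT plays is likewise not stated, so the sketch only splits the name into its two metric codes.

```python
from statistics import mean

# The six configurations named in the abstract.
SCHEMES = ["TF-TF", "TF-WT", "WT-TF", "WT-WT", "AS-TF", "AS-WT"]


def split_scheme(scheme: str) -> tuple[str, str]:
    """Split a scheme name such as 'AS-WT' into its two metric codes."""
    first, second = scheme.split("-")
    return first, second


def metric_reward(metric: str, *, flow: float, waiting: float, speed: float) -> float:
    """Map one metric code to a scalar reward (higher is better).

    Assumed conventions, not taken from the study:
    TF rewards vehicles passed, WT penalises waiting time, AS rewards mean speed.
    """
    if metric == "TF":
        return flow
    if metric == "WT":
        return -waiting
    if metric == "AS":
        return speed
    raise ValueError(f"unknown metric code: {metric}")


def fairness_index(waits_a: list[float], waits_b: list[float]) -> float:
    """Assumed definition: larger mean wait divided by smaller mean wait.

    A value of 1.0 means both directions wait equally long on average.
    """
    a, b = mean(waits_a), mean(waits_b)
    low, high = min(a, b), max(a, b)
    if high == 0:
        return 1.0  # nobody waited in either direction
    if low == 0:
        return float("inf")  # one direction never waited, the other did
    return high / low
```

Under this assumed definition, mean waits of 26 s and 25 s in the two directions would give an index of 1.04, so values near 1 indicate that neither direction is favoured.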
Copyrights © 2026