Object identification is one of the major application areas of deep learning, which provides significantly better feature extraction and representation than conventional recognition methods. Motivated by the growing importance of combining object detection with color interpretation in contemporary computer vision systems, this work proposes an integrated, real-time deep learning system that performs both object localization and color analysis. The proposed system employs a faster region-based convolutional neural network (Faster R-CNN) with a ResNet-50 backbone, supplemented by a feature pyramid network for multi-scale feature aggregation. The model was trained and tested on the Pascal VOC 2012 dataset, achieving an average precision of 0.8114, an F1 score of 0.6232, and an intersection over union (IoU) of 0.7096. Extensive experiments with different learning rates and numbers of training epochs were used to tune the detector to perform well under a variety of conditions. To improve interpretability, histogram of oriented gradients (HOG) visualization and gradient-weighted class activation mapping (Grad-CAM) were used to gain a deeper understanding of feature importance and of the reasoning behind the model's predictions. By combining object recognition and color analysis in a single architecture, this study extends image perception with color information, with potential applications in autonomous vehicles, industrial automation, and medical imaging.
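For readers less familiar with the reported metrics, the following is a minimal sketch of how IoU and F1 are commonly computed, for a single pair of boxes and for a set of detection counts respectively. The function names and the (x_min, y_min, x_max, y_max) box convention are assumptions made for illustration; the exact evaluation protocol behind the figures above is not described here.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    # Corners of the intersection rectangle (may be empty).
    ix_min, iy_min = max(a[0], b[0]), max(a[1], b[1])
    ix_max, iy_max = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def f1_score(tp, fp, fn):
    """F1 from true positive, false positive and false negative counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0
```

In detection benchmarks, a predicted box is typically counted as a true positive when its IoU with a ground-truth box of the same class exceeds a threshold such as 0.5, and F1 is then derived from the resulting counts.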
Copyrights © 2026