This research evaluates and compares the performance of three Convolutional Neural Network (CNN) architectures, namely VGG16, Xception, and NASNet Mobile, in detecting the colors of objects. The main problem addressed is determining which architecture, with which combination of hyperparameters, detects object colors most effectively and efficiently. The research process includes problem identification, collection of an object-color dataset, image preprocessing, training of the three CNN models (VGG16, Xception, and NASNet Mobile), and performance evaluation using accuracy, precision, recall, and F1-score metrics. In addition, a comparative analysis of each model's performance is carried out based on the combination of hyperparameters used, such as optimizer, batch size, and learning rate. The analysis also evaluates computational efficiency by measuring the training time and prediction time of each model, and examines the relationship between architectural complexity and classification performance. The results of the analysis are used to determine the optimal model for implementation in an object color detection system. The test results show that NASNet Mobile offers the best balance of accuracy and efficiency, with an accuracy of 88% and a prediction time of 2 minutes 22 seconds for 2,904 images. The Xception model achieved an accuracy of 86% with a prediction time of 4 minutes 22 seconds, while VGG16 recorded an accuracy of 90% with a prediction time of 10 minutes 9 seconds.
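The evaluation metrics named above (accuracy, precision, recall, F1-score) can be computed from a model's predicted labels against the ground-truth labels. A minimal sketch in plain Python follows, using macro averaging across classes; the color labels and predictions shown are purely illustrative and are not the paper's data:

```python
def classification_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1-score."""
    labels = sorted(set(y_true) | set(y_pred))
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    precisions, recalls, f1s = [], [], []
    for c in labels:
        # Per-class true positives, false positives, false negatives
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = len(labels)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,  # macro average
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }

# Illustrative color labels only, not the paper's dataset
y_true = ["red", "red", "blue", "green", "blue", "green"]
y_pred = ["red", "blue", "blue", "green", "blue", "red"]
print(classification_metrics(y_true, y_pred))
```

In practice a library implementation (e.g. scikit-learn's `classification_report`) would be used on the full test set; the sketch only makes the macro-averaging explicit.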