This study classifies AI-generated and real images with Convolutional Neural Network (CNN) architectures, comparing the performance of MobileNetV2 and ResNet50. Previous studies on AI-generated image detection have focused primarily on binary classification without explicitly analyzing object-level context in multi-class scenarios, leaving a gap in understanding how models perform across diverse visual categories. The dataset consists of 23,941 images divided into two main classes (real and fake) and five subclasses (human, animal, art, view, and vehicle). Training employs data augmentation and a K-Fold Cross Validation strategy on the training and validation set to maintain balanced class proportions, while a separate unseen test set is reserved exclusively for final performance evaluation. The models are evaluated on the test data using accuracy, precision, recall, and F1-score. MobileNetV2 achieved its best accuracy of 89% at the 10th epoch, but its performance declined at the 30th and 50th epochs, indicating overfitting. In contrast, ResNet50 showed the more stable performance, reaching its highest accuracy of 93% at the 30th epoch with consistently high precision, recall, and F1-score values. Thus, ResNet50 was found to be the more effective architecture for classifying AI-generated and real images on multi-class datasets, while MobileNetV2 remains relevant for deployment on devices with limited computational resources.
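The four evaluation metrics named above can be illustrated with a minimal, dependency-free sketch. It is not the study's evaluation code. The function name `macro_metrics`, the toy labels, and the use of macro-averaging (an unweighted mean over classes) are assumptions made for illustration, since the abstract does not state how per-class scores were aggregated.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Macro-averaging is an assumption of this sketch; the abstract does not
    specify the averaging scheme that was actually used.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative

    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for c in labels:
        p = tp[c] / (tp[c] + fp[c]) if (tp[c] + fp[c]) else 0.0
        r = tp[c] / (tp[c] + fn[c]) if (tp[c] + fn[c]) else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical toy labels, not data from the study.
y_true = ["real", "real", "fake", "fake"]
y_pred = ["real", "fake", "fake", "fake"]
scores = macro_metrics(y_true, y_pred)
print(scores)
```

On this toy input one real image is misclassified as fake, so accuracy is 0.75 while macro precision (about 0.83) and macro recall (0.75) differ. This is why the study reports all four metrics rather than accuracy alone.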