This study evaluates the performance of the EfficientNet architecture for image classification on Fashion-MNIST (70,000 grayscale images, 10 classes). The training/testing split follows the standard 60,000/10,000 scheme, with an internal validation subset drawn from the training data. Preprocessing resizes images to match EfficientNet’s input requirements. The model is trained with the Adam optimizer (learning rate 0.001) and a batch size of 32 for 20 epochs, with data augmentation and metric monitoring. Evaluation on the test set uses accuracy, precision, recall, F1-score, and the confusion matrix. The results show accuracy = 0.9429, precision = 0.9426, recall = 0.9429, and F1-score = 0.9425. Per-class analysis indicates that Trouser and Bag achieve the highest performance, while T-shirt/top and Shirt are the most challenging classes because of their visual similarity, as reflected in the confusion matrix. Compared with several baselines, namely a standard CNN, CNN-3-128, VGG16, XG-ViT (Vision Transformer), and DRQCNN, EfficientNet attains the best overall score. Its margin over these baselines is small, however, so its practical significance depends on application goals.
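As a rough illustration of how the reported metrics relate to the confusion matrix, the sketch below computes accuracy and macro-averaged precision, recall, and F1-score from a small 3-class matrix. The function name `classification_metrics`, the matrix values, and the choice of macro averaging are all assumptions made for illustration; the study does not state which averaging scheme it used. With balanced classes, as in the Fashion-MNIST test set (1,000 images per class), macro recall equals accuracy, which is consistent with the identical recall and accuracy values reported above.

```python
def classification_metrics(cm):
    """Compute accuracy and macro-averaged precision, recall, and F1.

    cm[i][j] is the number of samples whose true class is i and whose
    predicted class is j. Macro averaging is an assumption here.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(n))

    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = cm[k][k]
        predicted_k = sum(cm[i][k] for i in range(n))  # column sum
        actual_k = sum(cm[k])                          # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    return {
        "accuracy": correct / total,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical balanced 3-class confusion matrix (100 samples per class).
# Classes 0 and 2 are confused with each other, loosely mimicking the
# T-shirt/top vs. Shirt confusion described in the text.
cm = [
    [90, 2, 8],
    [1, 98, 1],
    [12, 0, 88],
]

metrics = classification_metrics(cm)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because every row of the example matrix sums to the same count, the mean of the per-class recalls equals overall accuracy. If weighted averaging were used instead, the same equality would hold for any class distribution.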
Copyrights © 2025