Cataracts are the leading cause of blindness worldwide, with 94 million cases reported in 2023. Conventional cataract identification relies on visual examination, which is subjective and therefore prone to error. This study compares the performance of two Convolutional Neural Network (CNN) architectures, MobileNetV2 and EfficientNet-B4, in detecting cataracts from images. The dataset was sourced from Kaggle and consisted of 1,074 normal images and 1,038 cataract images. The pipeline comprised preprocessing, augmentation, and transfer learning with ImageNet-pretrained weights. The models were evaluated using accuracy, loss, precision, recall, F1-score, and error rate, together with visual interpretation using Grad-CAM. MobileNetV2 achieved 96% accuracy (error rate 4.05%), with precision, recall, and F1-score all at 0.96 and a loss of 0.60. EfficientNet-B4 achieved 96.5% accuracy (error rate 3.47%), with precision, recall, and F1-score all at 0.97 and a lower loss of 0.12. EfficientNet-B4 had the same error rate on training and test data (3.47%) and a loss difference of 0.02 between them, suggesting that the model generalizes well and does not overfit. For MobileNetV2, the gap between the training error rate (3.28%) and the test error rate (4.05%) is small (0.77 percentage points), indicating that this model also does not overfit. Grad-CAM visualization shows that EfficientNet-B4 focuses more on clinically relevant areas, whereas MobileNetV2 tends to capture global patterns. Overall, EfficientNet-B4 is considered superior in accuracy and generalization, while MobileNetV2 is more computationally efficient.
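As an illustration of how the reported metrics relate to one another, the following minimal sketch computes accuracy, error rate, precision, recall, and F1-score from a binary confusion matrix, and the train/test error gap used above as an overfitting check. The function names and the confusion-matrix counts are hypothetical and chosen only for illustration; they are not the study's code or data. Only the 3.28% and 4.05% error rates are taken from the results above.

```python
def binary_metrics(tp, fp, fn, tn):
    """Standard binary classification metrics from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "error_rate": 1.0 - accuracy,  # error rate is the complement of accuracy
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def generalization_gap(train_error_pct, test_error_pct):
    """Test error minus training error, in percentage points."""
    return test_error_pct - train_error_pct


# Hypothetical counts for illustration only (not from the study).
example = binary_metrics(tp=90, fp=5, fn=10, tn=95)
print(example)

# Error rates reported above for MobileNetV2 (training 3.28%, test 4.05%).
print(round(generalization_gap(3.28, 4.05), 2))
```

A small gap between training and test error, as in the second call, is the kind of evidence the abstract uses to argue that a model is not overfitting; a large positive gap would suggest the opposite.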
Copyright © 2026