Cataract is one of the leading causes of visual impairment worldwide, and detecting it from retinal images remains challenging in medical image analysis because image quality varies and clinical assessment is subjective. This study evaluates how two image preprocessing techniques, Contrast Limited Adaptive Histogram Equalization (CLAHE) and image upscaling, affect the performance and interpretability of deep learning-based cataract classification models. Three convolutional neural network (CNN) architectures (Inception-ResNetV2, EfficientNetB0, and ResNet-50) were assessed on a balanced dataset of 2,000 retinal images under two experimental settings: raw images and enhanced images. The models were evaluated using accuracy, precision, recall, and F1-score, and Gradient-weighted Class Activation Mapping (Grad-CAM) was used to analyze model interpretability. On raw images, EfficientNetB0 achieved the highest accuracy (96%), followed by ResNet-50 (94%) and Inception-ResNetV2 (92%). After CLAHE and upscaling were applied, ResNet-50 improved to 95% accuracy, whereas the accuracy of EfficientNetB0 and Inception-ResNetV2 decreased to 83%. Grad-CAM visualizations indicate that all models consistently focused on clinically relevant regions associated with cataract characteristics. These findings show that image enhancement does not universally improve classification performance and that its effect depends strongly on the underlying CNN architecture. The study provides practical guidance for selecting preprocessing-model combinations when developing accurate, interpretable, and robust deep learning-based cataract classification systems for medical decision support.
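For readers less familiar with the four evaluation metrics named above, the following is a minimal sketch of how accuracy, precision, recall, and F1-score are conventionally computed for a two-class problem. It is an illustration only, not the study's evaluation code: the function name `binary_metrics`, the binary encoding (1 = cataract, 0 = normal), and the toy labels are all assumptions introduced here, and the numbers it produces are unrelated to the reported results.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for binary labels.

    Hypothetical helper for illustration; `positive` marks the class
    treated as "cataract" (an assumed encoding, not from the study).
    """
    pairs = list(zip(y_true, y_pred))
    # Count the confusion-matrix cells.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    # Guard every ratio against an empty denominator.
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels for eight images (not data from the study).
toy_true = [1, 1, 1, 1, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 0, 0, 0, 0, 1]
toy_metrics = binary_metrics(toy_true, toy_pred)
```

On a balanced dataset such as the one described, accuracy is a reasonable headline figure, but precision and recall still separate the two kinds of error (false alarms versus missed cataracts), which is why all four are usually reported together.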
Copyrights © 2026