Introduction: Diabetic retinopathy (DR) is a vision-threatening complication of diabetes that requires early and accurate diagnosis. Deep learning offers a promising route to automating DR classification from retinal images. This study compares two convolutional neural network (CNN) architectures, ResNet-50 and DenseNet-121, for classifying DR severity levels. Methods: A dataset of 2,000 pre-processed and augmented retinal images was used, categorized into four classes: normal, mild, moderate, and severe. Both models were trained under two schemes: a standard train-test split and stratified k-fold cross-validation (k = 5). Data augmentation (flipping, rotation, zooming, and translation) was applied to improve model generalization. The models were trained with the Adam optimizer at a learning rate of 0.001, a dropout rate of 0.2, and learning-rate adjustment via ReduceLROnPlateau. Performance was evaluated using accuracy, precision, recall, and F1-score. Results: ResNet-50 outperformed DenseNet-121 across all evaluation metrics. With the standard split, ResNet-50 achieved 84% accuracy compared with DenseNet-121’s 80%; with k-fold cross-validation, ResNet-50 scored 83% and DenseNet-121 81%. ResNet-50 also showed better class-wise balance, with higher recall and F1-score, especially for the moderate and severe DR classes. Confusion matrices confirmed fewer misclassifications with ResNet-50. Conclusions: ResNet-50 provides superior accuracy and robustness in classifying DR severity levels compared with DenseNet-121. While k-fold cross-validation enhances model stability, its effect on overall accuracy was small and mixed: ResNet-50 fell from 84% to 83%, whereas DenseNet-121 rose from 80% to 81%. These findings support the use of ResNet-50 in developing reliable deep learning-based screening tools for early DR detection in clinical practice.
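To make the evaluation protocol concrete, the following is a minimal, dependency-free Python sketch of stratified 5-fold assignment and of per-class precision, recall, and F1-score. It is not the study's code. The function names `stratified_folds` and `per_class_metrics` are hypothetical, the even split of 2,000 images over the four classes is an assumption (the abstract reports no class counts), and the toy predictions are invented purely for illustration. In practice, scikit-learn's `StratifiedKFold` and `classification_report` would normally serve the same purpose.

```python
from collections import defaultdict

CLASSES = ["normal", "mild", "moderate", "severe"]


def stratified_folds(labels, k=5):
    """Assign each sample index to one of k folds with similar class proportions.

    Samples of each class are dealt round-robin across the folds, so every
    fold receives roughly 1/k of every class. No shuffling is done here.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def per_class_metrics(y_true, y_pred, classes):
    """Return overall accuracy and per-class precision, recall, and F1."""
    pairs = list(zip(y_true, y_pred))
    report = {}
    for cls in classes:
        tp = sum(1 for t, p in pairs if t == cls and p == cls)
        fp = sum(1 for t, p in pairs if t != cls and p == cls)
        fn = sum(1 for t, p in pairs if t == cls and p != cls)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report[cls] = {"precision": precision, "recall": recall, "f1": f1}
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    return accuracy, report


# Assumed class distribution: 2,000 images split evenly over four classes.
labels = [CLASSES[i % 4] for i in range(2000)]
folds = stratified_folds(labels, k=5)

# Toy predictions for illustration only (not the study's results).
y_true = ["normal", "normal", "mild", "moderate", "severe", "severe"]
y_pred = ["normal", "mild", "mild", "moderate", "severe", "moderate"]
accuracy, report = per_class_metrics(y_true, y_pred, CLASSES)
```

Under the assumed even split, each of the five folds holds 400 images with 100 per class; each fold serves once as the validation set while the remaining four are used for training. Reporting recall per class, rather than accuracy alone, is what exposes differences in the moderate and severe categories.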