Deep learning-based facial emotion recognition (FER) is widely applied in healthcare, education, and human–computer interaction. However, many deep learning models suffer from suboptimal hyperparameter configurations that reduce accuracy and stability. This study proposes three deep residual recurrent fusion models that integrate residual blocks with recurrent neural networks, namely bidirectional long short-term memory (BiLSTM), long short-term memory (LSTM), and gated recurrent unit (GRU) networks, to capture both spatial and temporal features. A systematic hyperparameter optimization strategy was applied, tuning kernel size, filter size, recurrent units, batch size, learning rate, dropout, and weight decay to balance generalization and computational efficiency. The models were evaluated on four benchmark datasets: FER2013, FERPlus, RAF-DB, and CK+. The optimized configurations reached accuracies of 99.85% on FER2013, 99.99% on FERPlus, and 100% on RAF-DB and CK+. These findings indicate that careful hyperparameter tuning enhances feature extraction, mitigates vanishing-gradient and overfitting issues, and improves generalization across diverse datasets. The proposed framework highlights the importance of hyperparameter optimization in building robust FER systems for real-world applications.
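The tuning procedure summarized above can be pictured as a search over a configuration space that covers the three recurrent variants and the seven tuned hyperparameters. The sketch below is illustrative only: the candidate values, the function names (`grid`, `sample_configs`, `select_best`), and the `evaluate` callback are assumptions introduced here, not the settings or search method reported in the study.

```python
import itertools
import random

# Hypothetical search space: the candidate values are illustrative
# placeholders, not the settings reported in the study.
SEARCH_SPACE = {
    "recurrent_type": ["BiLSTM", "LSTM", "GRU"],
    "kernel_size": [3, 5],
    "filters": [32, 64],
    "recurrent_units": [64, 128],
    "batch_size": [32, 64],
    "learning_rate": [1e-3, 1e-4],
    "dropout": [0.3, 0.5],
    "weight_decay": [0.0, 1e-4],
}


def grid(space):
    """Yield every configuration in the Cartesian product of the space."""
    keys = list(space)
    for values in itertools.product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))


def sample_configs(space, n, seed=0):
    """Draw n configurations at random (with replacement across draws)."""
    rng = random.Random(seed)
    return [{k: rng.choice(v) for k, v in space.items()} for _ in range(n)]


def select_best(configs, evaluate):
    """Return the configuration with the highest validation score.

    `evaluate` is a caller-supplied (hypothetical) function that trains a
    model for one configuration and returns its validation accuracy.
    """
    return max(configs, key=evaluate)
```

An exhaustive grid grows multiplicatively with each added hyperparameter, so a random sample of configurations is one common way to keep the number of training runs manageable; which strategy the authors used is not stated in the abstract.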
Copyrights © 2026