This research investigates advancements in Facial Expression Recognition (FER) within the domain of affective computing, focusing on improving the accuracy and robustness of FER systems under diverse, real-world conditions. Facial expressions serve as critical non-verbal cues in human communication, yet existing FER systems often struggle with environmental variability such as changes in lighting, pose, and occlusion. This study evaluates the performance of three Convolutional Neural Network (CNN) architectures (ResNet50, VGG16, and MobileNetV3Large) integrated with preprocessing techniques, namely Contrast Limited Adaptive Histogram Equalization (CLAHE) and the Synthetic Minority Oversampling Technique (SMOTE), which address key dataset challenges such as class imbalance and low contrast. Results demonstrate the pivotal role of tailored preprocessing strategies: applying CLAHE and SMOTE raised the VGG16 model's test accuracy from 0.70 to 0.79, an absolute improvement of 9 percentage points (roughly 13% relative). This improvement underscores the effectiveness of combining advanced preprocessing methods with established CNN architectures. Furthermore, the findings highlight how optimized preprocessing enhances the recognition of subtle emotions in uncontrolled settings, offering practical guidance for deploying FER systems in real-time applications. Overall, this research demonstrates that preprocessing techniques can substantially improve FER system performance, particularly when paired with well-established deep learning models. These insights pave the way for the development of more accurate, robust, and adaptable FER systems capable of functioning reliably in dynamic, real-world environments.