Short-term electric load forecasting plays a vital role in ensuring the stability and efficiency of smart grid operations. However, accurately predicting demand remains challenging because consumption patterns are nonlinear, volatile, and subject to long-term temporal dependencies. This study proposes a lightweight hybrid deep learning model that integrates a Transformer encoder with a multi-layer perceptron (MLP) to improve prediction accuracy and robustness in short-term load forecasting. The Transformer encoder extracts long-range temporal features through self-attention, while the MLP captures complex nonlinear mappings at the output stage. The model is evaluated on a real-world electricity load dataset collected from three Australian states (NSW, QLD, and VIC) between 2009 and 2014. Performance is assessed using mean absolute percentage error (MAPE), mean squared error (MSE), and root mean squared error (RMSE). Experimental results show that the proposed Transformer-MLP model consistently achieves the lowest forecasting error across all three regions, with MAPE ranging from 0.69% to 0.95%, and outperforms standard deep learning models, including LSTM, CNN, and MLP. Despite its shallow architecture and reduced computational complexity, the hybrid model effectively captures both temporal dependencies and nonlinear variations. This study thus provides a practical, deployable forecasting solution for smart grids. Future work will extend the model to multi-step forecasting, incorporate exogenous variables such as weather and calendar effects, and explore deeper Transformer variants to further enhance prediction accuracy and generalization across diverse load conditions.
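The three evaluation metrics named above have standard definitions. The following is a minimal sketch of them in plain Python, assuming equal-length sequences and nonzero actual load values. The function names and the sample load values are illustrative assumptions and are not taken from the study's code or dataset.

```python
import math


def _check(actual, predicted):
    # Both sequences must be non-empty and the same length.
    if len(actual) == 0 or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equal in length")


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (assumes no actual value is zero)."""
    _check(actual, predicted)
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean squared error, in squared units of the load."""
    _check(actual, predicted)
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error, in the same units as the load."""
    return math.sqrt(mse(actual, predicted))


# Made-up load values in MW, for illustration only (not from the study's dataset).
actual = [8000.0, 8200.0, 7900.0, 8100.0]
predicted = [8080.0, 8118.0, 7979.0, 8019.0]

print(f"MAPE: {mape(actual, predicted):.2f}%")
print(f"MSE:  {mse(actual, predicted):.1f}")
print(f"RMSE: {rmse(actual, predicted):.2f}")
```

MAPE is scale-free, which makes it convenient for comparing regions with different demand levels, whereas MSE and RMSE are scale-dependent and penalize large errors more heavily.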