The rapid growth of multi-category e-commerce platforms has increased the importance of behavioral data for predicting Customer Lifetime Value (CLV). However, monetary CLV estimation is often infeasible because transaction records are incomplete or unavailable. This study adopts session duration as a short-term behavioral proxy for CLV and proposes a Bidirectional Long Short-Term Memory (Bi-LSTM) model enhanced with a Temporal Attention mechanism to improve predictive accuracy. The publicly available REES46 dataset, consisting of 1.6 million events and 276,000 unique sessions, is preprocessed with label encoding, temporal feature construction, and outlier-aware sampling to address the highly right-skewed distribution of session durations. Four baseline models are implemented for comparison: Decision Tree, Random Forest, Extreme Gradient Boosting (XGBoost), and a conventional Long Short-Term Memory (LSTM) network. The baseline LSTM achieves a mean absolute error (MAE) of 0.0080 and a root mean squared error (RMSE) of 0.0322. The proposed Bi-LSTM v3 model, equipped with Temporal Attention and structured sampling, achieves MAE = 0.0043 (≈368 seconds) and RMSE = 0.0172 (≈1466 seconds), an error reduction of approximately 46% on both metrics relative to the baseline LSTM. Explainability analysis using SHapley Additive exPlanations (SHAP) and Local Interpretable Model-Agnostic Explanations (LIME) confirms that the time_diff feature is the dominant contributor at both global and local levels, consistent with the behavior of the attention mechanism. Together, these Explainable Artificial Intelligence (XAI) techniques provide transparent insight into the model's decision patterns. The findings show that combining Bi-LSTM, Temporal Attention, and XAI yields an accurate and interpretable framework for session duration prediction, supporting the use of session duration as a feasible CLV proxy in multi-category e-commerce environments.
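
The relative improvement over the baseline LSTM can be recomputed from the MAE and RMSE values reported above. The short sketch below is only an arithmetic check, not part of the study's pipeline; the helper name `relative_reduction` and the dictionary layout are illustrative assumptions.

```python
def relative_reduction(baseline: float, proposed: float) -> float:
    """Fractional decrease in an error metric relative to the baseline."""
    if baseline <= 0:
        raise ValueError("baseline error must be positive")
    return (baseline - proposed) / baseline


# Error values as reported in the abstract.
lstm = {"MAE": 0.0080, "RMSE": 0.0322}
bilstm_v3 = {"MAE": 0.0043, "RMSE": 0.0172}

# Hypothetical container for the per-metric reductions.
reductions = {m: relative_reduction(lstm[m], bilstm_v3[m]) for m in lstm}

for metric, value in reductions.items():
    print(f"{metric}: {value:.1%} lower than the baseline LSTM")
```

Both metrics come out at roughly 46%, which falls inside the 45 to 50% range quoted for the improvement.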