This study presents a comparative evaluation of three hybrid deep learning models for human activity recognition (HAR) under free-living, highly imbalanced conditions: 1DCNN-ResBLSTM-Attention (Model A), Attention-Mechanism-Based Deep Learning Feature Combination (Model B), and Time-Reversal-1DCNN-ResLSTM-Attention (Model C). Each architecture combines convolutional layers for feature extraction, recurrent networks for temporal modeling, and attention mechanisms that emphasize the most relevant representations. The evaluation uses the HARTH v2.0 dataset, which comprises 31 subjects and 15 activity classes with strong class imbalance. Results show that soft labeling consistently improves performance because it better captures the transitional uncertainty present in windowed sensor data. Model A achieves the highest accuracy (96.21%) and macro-averaged F1-score (88.17%). Model C follows with comparable performance at lower computational cost, while Model B underperforms on minority classes due to limitations of spectrogram-based representations. Across all models, confusion persists among activities with similar motion patterns, such as walking, standing, and shuffling, which indicates intrinsic ambiguity in the sensor signals. This study provides a controlled and standardized comparison of hybrid architectures under realistic conditions, revealing both performance trade-offs and shared limitations. The findings highlight the importance of modeling uncertainty and temporal context for improving robustness, particularly for transitional and underrepresented activities.
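To illustrate why soft labeling can capture transitional uncertainty in a window, the following is a minimal sketch. It assumes that a soft label is the proportion of per-sample annotations of each class within the window; the abstract does not state how the soft labels are constructed, so this is one plausible scheme rather than the study's method. The function names, the class indices, and the example window are hypothetical.

```python
from collections import Counter


def soft_label(window_labels, num_classes):
    """Return the per-class proportion of samples in one window.

    Assumption: labels are integer class indices in [0, num_classes).
    """
    if not window_labels:
        raise ValueError("window must contain at least one sample")
    counts = Counter(window_labels)
    n = len(window_labels)
    return [counts.get(c, 0) / n for c in range(num_classes)]


def hard_label(window_labels):
    """Return the majority class, the usual hard-label alternative."""
    return Counter(window_labels).most_common(1)[0][0]


# Hypothetical 8-sample window that spans a transition from class 0 to class 1.
window = [0, 0, 0, 1, 1, 1, 1, 1]
soft = soft_label(window, num_classes=3)
hard = hard_label(window)
```

Under this assumption, the hard label discards the three samples of class 0, whereas the soft label keeps them as a 0.375 share of the target distribution, so a model trained against it is not pushed toward full confidence on a window that mixes two activities.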
Copyrights © 2026