Combining dual attention mechanism and efficient feature aggregation for road and vehicle segmentation from UAV imagery
Nguyen, Trung Dung; Pham, Trung Kien; Ha, Chi Kien; Le, Long Ho; Ngo, Thanh Quyen; Nguyen, Hoanh
Bulletin of Electrical Engineering and Informatics Vol 13, No 3: June 2024
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/eei.v13i3.6742

Abstract

Unmanned aerial vehicles (UAVs) have gained significant popularity in recent years due to their ability to capture high-resolution aerial imagery for applications such as traffic monitoring, urban planning, and disaster management. Accurate road and vehicle segmentation from UAV imagery plays a crucial role in these applications. In this paper, we propose a novel approach that combines dual attention mechanisms with efficient multi-layer feature aggregation to improve road and vehicle segmentation from UAV imagery. Our approach integrates a spatial attention mechanism and a channel-wise attention mechanism so that the model can selectively focus on the features relevant to segmentation. In conjunction with these attention mechanisms, we introduce an efficient multi-layer feature aggregation method that integrates multi-scale features from different levels of the network, resulting in a more robust and informative feature representation. The proposed method is evaluated on the UAVid semantic segmentation dataset against U-Net, DeepLabv3+, and SegNet. The experimental results show that our approach surpasses these methods in segmentation accuracy.
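
The combination of channel-wise and spatial attention described in this abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the gates here are parameter-free (a sigmoid over pooled activations), whereas a trained model would normally compute them with learned layers, and applying the two gates one after the other is an assumption, since the abstract does not say how they are combined. The names `channel_attention`, `spatial_attention`, and `dual_attention` are hypothetical.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function, mapping values into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat):
    """Re-weight each channel of a (C, H, W) feature map.

    Assumption: the gate is a sigmoid of the channel's global average,
    with no learned parameters.
    """
    pooled = feat.mean(axis=(1, 2))        # one value per channel, shape (C,)
    gate = sigmoid(pooled)                 # per-channel weight in (0, 1)
    return feat * gate[:, None, None]      # broadcast over H and W


def spatial_attention(feat):
    """Re-weight each spatial location of a (C, H, W) feature map.

    Assumption: the gate is a sigmoid of the mean across channels.
    """
    pooled = feat.mean(axis=0)             # one value per pixel, shape (H, W)
    gate = sigmoid(pooled)                 # per-pixel weight in (0, 1)
    return feat * gate[None, :, :]         # broadcast over channels


def dual_attention(feat):
    """Apply channel attention, then spatial attention (ordering assumed)."""
    return spatial_attention(channel_attention(feat))


# Toy feature map: 4 channels on an 8x8 grid.
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 8, 8))
refined = dual_attention(features)
print(refined.shape)
```

Because both gates lie strictly between 0 and 1, this sketch can only attenuate activations; it shows the "which channels" and "which locations" re-weighting idea, not the learned behaviour of the published model.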

Real-Time Prohibited Item Detection in X-ray Security Screening via Adaptive Multi-scale Feature Fusion and Lightweight Dynamic Convolutions
Nguyen, Hoanh; Ha, Chi Kien
Journal of Robotics and Control (JRC) Vol. 6 No. 4 (2025)
Publisher : Universitas Muhammadiyah Yogyakarta

DOI: 10.18196/jrc.v6i4.27030

Abstract

Prohibited item detection in X-ray security screening is a challenging task due to the diverse shapes, sizes, and materials of concealed objects. In this paper, we propose a novel end-to-end framework that integrates adaptive multi-scale convolution blocks (AMC blocks) and an adaptive lightweight convolution module (ALCM) to address these challenges with high accuracy and efficiency. The AMC block uses parallel convolutional paths with varying kernel sizes and dilation rates, enabling it to capture both fine-grained and large-scale features. This multi-scale strategy is intended to let small items such as wires and larger objects such as bags or metallic weapons be detected equally well. Building on the multi-stage features extracted by the AMC block, we introduce the ALCM to refine and fuse feature maps at different pyramid levels. The ALCM employs a dynamic weight generator (DWG), which adaptively assigns importance to multiple convolutional kernels based on local content, followed by multi-scale depthwise convolutions (MSDC), a lightweight mechanism that enriches features across scales using parallel convolutions with different receptive fields. This approach enhances spatial context while keeping the parameter overhead minimal. Experimental results on two public large-scale X-ray datasets, OPIXray and HiXray, demonstrate that our method achieves state-of-the-art performance while maintaining real-time inference speed. Specifically, our model achieves 91.2% mAP@0.5 and 78.4% mAP@0.5:0.95 on OPIXray, and 87.3% mAP@0.5 and 73.5% mAP@0.5:0.95 on HiXray, outperforming strong baselines including YOLOv9 and Faster R-CNN. Alongside this accuracy, our model remains efficient, requiring 92.0 GFLOPs and running at 42 FPS. Furthermore, we examine the generalizability of our system across varied X-ray imaging settings and discuss failure cases such as false negatives in cluttered environments. These findings highlight the practical applicability of our approach for deployment in real-world security checkpoints, striking a strong balance between detection accuracy and computational efficiency.
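
To make the idea of parallel paths with varying kernel sizes and dilation rates concrete, the short Python sketch below computes how many pixels each branch spans along one axis, using the standard relation for a dilated kernel, span = dilation × (kernel_size − 1) + 1. The branch settings in `branches` are hypothetical and are not taken from the paper; the helper names are likewise illustrative.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Number of pixels spanned along one axis by a dilated kernel."""
    return dilation * (kernel_size - 1) + 1


def same_padding(kernel_size: int, dilation: int) -> int:
    """Padding per side that keeps the spatial size at stride 1.

    Valid for odd kernel sizes, which is the usual choice.
    """
    return dilation * (kernel_size - 1) // 2


# Hypothetical (kernel_size, dilation) pairs for four parallel branches.
branches = [(1, 1), (3, 1), (3, 2), (5, 3)]

for k, d in branches:
    span = effective_kernel_size(k, d)
    pad = same_padding(k, d)
    print(f"kernel={k}, dilation={d}: spans {span} px, padding {pad}")
```

With these illustrative settings the branches span 1, 3, 5, and 13 pixels, so a single block can respond to both thin structures and much larger objects while every branch keeps the same output resolution, which is what allows the branch outputs to be fused.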