Asep Surahmat
Faculty Technology and Design, Utpadaka Swastika University, Indonesia

Published: 1 document
Articles

CCTV-Based River Waste Detection Using a Hybrid CNN–Graph Attention Network with Spatial–Contextual Feature Learning
Asep Surahmat; Lukas Umbu Zogara; Fajar Muttaqi
Jurnal Teknik Informatika (Jutif) Vol. 7 No. 3 (2026): JUTIF Volume 7, Number 3, June 2026
Publisher : Informatika, Universitas Jenderal Soedirman

DOI: 10.52436/1.jutif.2026.7.3.5544

Abstract

River waste accumulation has become a serious environmental problem in urban areas, particularly in highly polluted rivers such as the Angke River in Tangerang, where floating waste disrupts ecological balance and increases flood risk. Conventional computer vision–based detection methods often fail under dynamic river conditions due to water surface reflections, turbulence, occlusion, and visually ambiguous debris. This study aims to improve the accuracy and robustness of river waste detection by proposing a hybrid deep learning framework that integrates convolutional and graph-based spatial–contextual reasoning. The proposed method utilizes a ResNet50 backbone for feature extraction from CCTV imagery, followed by spatial graph construction that models adjacency relationships between image regions. A Graph Attention Network (GAT) is then applied to capture contextual dependencies and refine feature representations prior to classification. Unlike conventional CNN-only or YOLO-based detectors that rely primarily on local visual cues and bounding-box representations, the proposed approach explicitly models spatial–contextual relationships between image regions through graph-based attention mechanisms. Experiments were conducted on 4,200 CCTV image frames collected from the Angke River under varying environmental conditions. The proposed model achieved an accuracy of 92.4%, precision of 91.1%, recall of 93.2%, F1-score of 91.9%, and a mean Average Precision (mAP) of 0.78, outperforming CNN-only and YOLO-based baseline models. These findings highlight the contribution of graph-enhanced visual reasoning to the fields of Computer Vision and Intelligent Surveillance, particularly for real-time environmental monitoring systems operating in complex and dynamic visual environments.
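
The pipeline described in the abstract (region features, a spatial graph over adjacent regions, then graph attention) can be illustrated with a minimal sketch. It is not the authors' implementation. Several things here are assumptions made only for illustration: the image is split into a regular 3×3 grid, adjacency means 4-neighbour contact plus a self-loop, random vectors stand in for ResNet50 region features, and the attention layer is a single-head layer in the style of the original GAT formulation with untrained random weights. The names `grid_adjacency` and `gat_layer` are hypothetical.

```python
import math
import random


def grid_adjacency(rows, cols):
    """Map each grid cell to its 4-neighbours plus itself (assumed adjacency rule)."""
    neighbours = {}
    for r in range(rows):
        for c in range(cols):
            node = r * cols + c
            nbrs = [node]  # self-loop so a region also attends to itself
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    nbrs.append(rr * cols + cc)
            neighbours[node] = nbrs
    return neighbours


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def gat_layer(features, neighbours, W, a):
    """Single-head graph attention: h_i' = sum_j alpha_ij * (W h_j)."""
    projected = [matvec(W, h) for h in features]
    refined, attention = [], {}
    for i, nbrs in neighbours.items():
        # Unnormalised score e_ij = LeakyReLU(a . [W h_i || W h_j])
        scores = [
            leaky_relu(sum(p * q for p, q in zip(a, projected[i] + projected[j])))
            for j in nbrs
        ]
        # Softmax over the neighbourhood (shifted by the max for stability)
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alphas = [e / total for e in exps]
        attention[i] = dict(zip(nbrs, alphas))
        # Attention-weighted sum of the projected neighbour features
        refined.append([
            sum(al * projected[j][k] for al, j in zip(alphas, nbrs))
            for k in range(len(W))
        ])
    return refined, attention


random.seed(0)
ROWS, COLS, IN_DIM, OUT_DIM = 3, 3, 4, 2  # toy sizes, not from the paper

# Stand-in for per-region CNN features (the paper uses a ResNet50 backbone)
features = [[random.gauss(0, 1) for _ in range(IN_DIM)] for _ in range(ROWS * COLS)]
W = [[random.gauss(0, 0.5) for _ in range(IN_DIM)] for _ in range(OUT_DIM)]
a = [random.gauss(0, 0.5) for _ in range(2 * OUT_DIM)]

neighbours = grid_adjacency(ROWS, COLS)
refined, attention = gat_layer(features, neighbours, W, a)
print(len(refined), "regions refined; centre region attends to", sorted(attention[4]))
```

In this sketch each region's refined feature is a weighted mix of its own and its neighbours' projected features, which is the sense in which a graph attention step adds spatial context before classification. A trained model would learn `W` and `a` rather than sampling them, and would typically use several attention heads.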