Graphical Abstract

Highlight Research
1. This study demonstrates the successful use of a self-developed remotely operated vehicle (ROV) to acquire underwater imagery for monitoring floating net cage conditions without manual diving.
2. Integrating ROV-based image acquisition with deep learning classification using the YOLOv8 model achieved high accuracy in identifying different levels of net fouling under real aquaculture field conditions.
3. Classification performance decreases as the number of fouling classes increases, owing to visual similarity between classes, environmental variability, and limited image distribution.
4. The proposed approach provides an effective, non-invasive, and practical monitoring solution that supports timely maintenance decisions and contributes to more sustainable management of floating net cage aquaculture systems.

Abstract
Monitoring the condition of floating net cages (FNC) is essential for maintaining water circulation, dissolved oxygen availability, and overall fish health in aquaculture systems. However, FNC-based aquaculture commonly faces biofouling accumulation, including barnacles, algae, sediment, dirt, and solid waste, which gradually obstructs water flow and reduces cage performance. This study aimed to develop an automated method for classifying floating net cage fouling conditions by integrating a self-developed remotely operated vehicle (ROV) with deep learning–based underwater image classification. Underwater monitoring produced 7,156 extracted image frames, which were processed through image selection and white balance color correction. A total of 741 images were used to train a YOLOv8 model under three classification schemes: 2-class, 3-class, and 6-class.
The results demonstrated high classification performance across all schemes, with accuracy values of 100% for the 2-class model, 99% for the 3-class model, and 98% for the 6-class model. These findings indicate that integrating ROV-based image acquisition with deep learning classification provides an effective approach for assessing floating net cage conditions, enabling timely maintenance, improving monitoring efficiency, and supporting better environmental management in aquaculture systems. Future studies should enlarge the dataset and cover a wider range of environmental conditions to further improve model robustness.
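
The white balance color correction step mentioned above can be illustrated with a minimal sketch. The abstract does not state which algorithm was applied, so the gray-world method used here is an assumption, and the function name `gray_world_white_balance` is hypothetical. The sketch works on a flat list of RGB pixels in pure Python to stay self-contained; a real pipeline would more likely operate on image arrays.

```python
def gray_world_white_balance(pixels):
    """Gray-world white balance on a list of (R, G, B) tuples in 0..255.

    Assumption: the study's actual correction method is not specified;
    gray-world is used here only as a common, simple example.
    """
    if not pixels:
        return []
    n = len(pixels)
    # Mean intensity of each channel over the whole image.
    means = [sum(p[c] for p in pixels) / n for c in range(3)]
    # Gray-world hypothesis: the three channel means should be equal,
    # so each channel is scaled toward their common average.
    gray = sum(means) / 3.0
    # A channel with zero mean is left unscaled to avoid division by zero.
    gains = [gray / m if m > 0 else 1.0 for m in means]
    return [
        tuple(min(255, max(0, round(p[c] * gains[c]))) for c in range(3))
        for p in pixels
    ]


# A blue-tinted frame, as is typical underwater, is pulled toward neutral gray.
balanced = gray_world_white_balance([(50, 100, 150), (50, 100, 150)])
```

Gray-world scaling tends to counter the blue-green cast of underwater footage, but it can over-correct scenes dominated by a single color, so its suitability for net fouling imagery would need to be checked against the study's own data.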