Found 2 Documents
Journal: Journal of Robotics and Control (JRC)

Ovarian Tumors Detection and Classification on Ultrasound Images Using One-stage Convolutional Neural Networks
Le, Van-Hung; Pham, Thi-Loan
Journal of Robotics and Control (JRC) Vol 5, No 2 (2024)
Publisher : Universitas Muhammadiyah Yogyakarta

DOI: 10.18196/jrc.v5i2.20589

Abstract

The advent of the CNN (Convolutional Neural Network) has brought very convincing results to computer vision problems. One-stage CNNs are a suitable choice for research and development to obtain an overview of the current results in detecting and classifying ovarian tumors (OTUM) on ovarian ultrasound images. In this paper, we performed a comprehensive study of one-stage CNNs for the problem of detecting and classifying OTUM on ovarian ultrasound images. We tested two popular OTUM datasets: OTU and USOVA3D. The one-stage CNNs we tested and evaluated belong to the YOLO (You Only Look Once) family (YOLOv5, YOLOv7, YOLOv8 variations, and YOLO-NAS) and the SSD (Single Shot MultiBox Detector) family (VGG16-SSD, Mb1-SSD, Mb1-SSDLite, Sq-SSD-Lite, and Mb2-SSD-Lite). The results of detecting OTUM (whether or not a tumor is present in the ovarian ultrasound image) are high (Mb1-SSD reaches Acc = 98.90%, P = 98.58%, R = 98.9% on the “USOVA3D 2D f r1 80 20” set; Mb2-SSD-Lite reaches Acc = 97.87%, P = 97.16%, R = 97.87% on the “USOVA3D 2D f r2 80 20” set). The results of detecting and classifying OTUM into 8 classes are lower (the highest is Acc = 92.04%, P = 74.81%, R = 92.04% on the OTU-2D dataset). Regarding computation time, CNNs of the YOLO family are faster than networks of the SSD family. These results show that classifying ovarian tumors on ultrasound images still poses many challenges to be resolved in the future.
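The detection results above are reported as accuracy (Acc), precision (P), and recall (R). Below is a minimal sketch of how such per-image binary detection metrics can be computed (in Python; the label arrays are hypothetical placeholders, not data from the paper):

from typing import Sequence

def detection_metrics(y_true: Sequence[int], y_pred: Sequence[int]):
    # Binary detection labels: 1 = tumor present, 0 = no tumor.
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    acc = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return acc, precision, recall

# Hypothetical ground truth and one-stage detector outputs.
y_true = [1, 1, 0, 1, 0, 0, 1, 1]
y_pred = [1, 1, 0, 1, 0, 1, 1, 0]
acc, p, r = detection_metrics(y_true, y_pred)
print(f"Acc = {acc:.2%}, P = {p:.2%}, R = {r:.2%}")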
Visual SLAM and Visual Odometry Based on RGB-D Images Using Deep Learning: A Survey
Le, Van-Hung
Journal of Robotics and Control (JRC) Vol 5, No 4 (2024)
Publisher : Universitas Muhammadiyah Yogyakarta

DOI: 10.18196/jrc.v5i4.22061

Abstract

Visual simultaneous localization and mapping (Visual SLAM) based on RGB-D images comprises two main tasks: building a map of the environment and simultaneously tracking the location/motion trajectory of the image sensor, also called visual odometry (VO). Visual SLAM and VO are used in many applications, such as robot systems, autonomous mobile robots, assistive systems for the blind, human-machine interaction, and industry. With the strong development of deep learning (DL), it has been applied to building Visual SLAM and VO from image sensor data (RGB-D images) and has brought impressive results. To give an overall picture of the development of DL applied to building Visual SLAM and VO systems, as well as the results, challenges, and advantages of DL models for these problems, we propose a taxonomy to conduct a complete survey based on three approaches to using RGB-D images: (1) using DL for the modules (depth estimation, optical flow estimation, visual odometry, mapping, and loop closure detection) of the Visual SLAM and VO framework; (2) using DL modules to supplement the Visual SLAM and VO framework (feature extraction, semantic segmentation, pose estimation, map construction, loop closure detection, and other modules); (3) using end-to-end DL to build Visual SLAM and VO systems. The studies are surveyed in the order of methods, datasets, and evaluation measures, and detailed results per dataset are also presented. In particular, the challenges of studies using DL to build Visual SLAM and VO systems are analyzed, and some of our further studies are introduced.
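For context on the VO module that the surveyed DL methods learn to replace or augment, below is a minimal classical frame-to-frame RGB-D visual odometry sketch (Python with OpenCV and NumPy, which the survey does not prescribe; the function name, intrinsics matrix K, and input frames are illustrative placeholders):

import numpy as np
import cv2

def rgbd_vo_step(bgr_prev, depth_prev, bgr_curr, K):
    # One frame-to-frame VO step: ORB feature matching, back-projection
    # of previous-frame pixels through the depth map, then PnP + RANSAC.
    # A classical baseline, not the paper's method.
    orb = cv2.ORB_create(1000)
    g1 = cv2.cvtColor(bgr_prev, cv2.COLOR_BGR2GRAY)
    g2 = cv2.cvtColor(bgr_curr, cv2.COLOR_BGR2GRAY)
    kp1, des1 = orb.detectAndCompute(g1, None)
    kp2, des2 = orb.detectAndCompute(g2, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des1, des2)

    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    pts3d, pts2d = [], []
    for m in matches:
        u, v = kp1[m.queryIdx].pt
        z = float(depth_prev[int(v), int(u)])  # depth in metres
        if z <= 0:                             # skip invalid depth pixels
            continue
        # Back-project the previous-frame pixel to a 3D point.
        pts3d.append([(u - cx) * z / fx, (v - cy) * z / fy, z])
        pts2d.append(kp2[m.trainIdx].pt)

    # Relative camera pose (rotation R, translation t) of the current
    # frame with respect to the previous frame.
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        np.float32(pts3d), np.float32(pts2d), K.astype(np.float64), None)
    R, _ = cv2.Rodrigues(rvec)
    return R, tvec

Chaining such relative poses yields the sensor trajectory; the DL approaches in categories (1)-(3) of the taxonomy replace the feature extraction, the depth input, or this entire step with learned components.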