Date of Award

8-19-2024

Document Type

Open Access Dissertation

Department

Computer Science and Engineering

First Advisor

Ioannis Rekleitis

Abstract

The main contributions of this thesis are the application and architectural design of deep learning algorithms for underwater imagery. In recent years, deep learning techniques for object classification and detection have achieved levels of accuracy that can surpass human performance. However, the effectiveness of these techniques in underwater environments has not been thoroughly studied. This thesis investigates several research areas related to underwater environments: object classification, detection, semantic segmentation, pose regression, and semi-supervised retraining of detection models.

The first part of the thesis studies image classification and detection. Image classification assigns a label to an image from a predetermined set of categories. Detection, on the other hand, locates an object within an image and assigns it a label. We have developed a coral classification model, MDNet, that is trained on point-annotated visual data and can classify eight species of corals. MDNet incorporates state-of-the-art convolutional neural network architectural designs that allow for acceleration on embedded devices. To further extend its capabilities, we combine the detection capability of MDNet with Kernelized Correlation Filters (KCF)-based trackers to identify unique coral objects. For a given trajectory over the seafloor, we can track unique coral objects and estimate coral population density. This population estimate is a valuable tool for marine biologists analyzing the effects of climate change and water pollution on coral reefs over time. To deploy the system on embedded platforms such as Aqua2, we have conducted a comprehensive study of available neural network accelerators based on field-programmable gate arrays (FPGAs) and optimized MDNet to achieve real-time performance. For object detection, we combine the output of the classifier model with a crowd-annotated dataset to develop a robust model for detecting relevant species of corals. We also test how well models designed for underwater images generalize to the medical domain. Similar models were trained to classify and quantify nuclei from human blood neutrophils. The model achieved over 94% accuracy in distinguishing cell types.

The next part of the thesis explores how to integrate deep-learning-based object detection with a SLAM system to create a semantic 3D map. A semantic 3D map is required for sea-floor exploration and coral reef monitoring systems. In our research, we integrate a coral reef detection algorithm with Direct Sparse Odometry (DSO), a Simultaneous Localization and Mapping (SLAM) method. By combining the output of the detection system with DSO feature mapping, we build a semantic 3D map that allows for efficient navigation and better coverage of the coral reef.

In the subsequent part of the thesis, we extend object detection neural networks to predict the 6D pose of underwater vehicles. In deep learning, pose regression, the process of predicting a 6D pose, uses monocular images to predict the 3D position and orientation of an object. To facilitate cooperative localization, we have created a vision-based localization system called DeepURL for Aqua2 robots operating underwater. The DeepURL system first detects objects in the images and then predicts their 3D positions and orientations.

Finally, in the fourth part of the thesis, we develop a semi-supervised approach for training the detection algorithm on a dataset in which only a subset of samples is labeled. This allows the algorithm to use unlabeled visual data collected during future experiments and scuba dives. We have found that this semi-supervised approach improves the performance and robustness of the detection algorithm.

The thesis aims to develop deep-learning-based object understanding in underwater environments while maintaining the generalization capability of the models. We demonstrate how object classification and detection can be redesigned and repurposed for underwater environments. We also provide the intuitions behind the model designs and evaluate them against state-of-the-art models.

Rights

© 2024, Md Modasshir
