Visual Mapping of the Ocean Floor

Traditionally, the ocean floor has been mapped with acoustic techniques, e.g. using echo sounders. Our goal is to complement the acoustic information by establishing visual mapping techniques (and ultimately to combine the best of both worlds). Visual maps are directly understandable by human observers, and they provide very high-resolution 3D models that allow distances, surfaces, volumes etc. to be measured. While e.g. multibeam echo sounders require external sensors for localization, the motion of a camera can be derived from the video sequence itself, a technique known as simultaneous localization and mapping (SLAM) or "structure and motion". With these methods we aim to map large deep-sea environments in order to document their current state and to detect what is changing.
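
To illustrate the core idea, here is a minimal sketch of recovering the relative camera motion between two frames of a video sequence from matched feature points. It assumes the OpenCV library; the frame filenames and the intrinsic matrix K are placeholder values that would come from the actual recordings and from camera calibration in practice.

```python
import cv2
import numpy as np

# Hypothetical consecutive frames from a seafloor video sequence.
img1 = cv2.imread("frame_000.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("frame_001.png", cv2.IMREAD_GRAYSCALE)

# Detect local features in both frames and match them.
orb = cv2.ORB_create(nfeatures=2000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = matcher.match(des1, des2)

pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

# Placeholder camera intrinsics; real values come from calibration.
K = np.array([[800.0, 0.0, 640.0],
              [0.0, 800.0, 360.0],
              [0.0, 0.0, 1.0]])

# The essential matrix encodes the geometric constraint between the
# two views; RANSAC rejects outlier correspondences.
E, mask = cv2.findEssentialMat(pts1, pts2, K,
                               method=cv2.RANSAC, threshold=1.0)

# Decompose it into a relative rotation R and a translation
# direction t (the scale of t is not observable from two views alone).
_, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=mask)
print("relative rotation:\n", R)
print("translation direction:", t.ravel())
```

Chaining such pairwise motion estimates over the whole sequence, and correcting them whenever a place is revisited, is what turns a raw video into a usable camera trajectory.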

One important and dynamic environment is the black smoker field. Here, extremely hot water escapes from deep under the seafloor, and the minerals it carries precipitate to form large structures, e.g. towers up to 20 m high. Black smokers are of very high biological and geological interest and can also contain large amounts of mineral resources. Understanding their growth and the behavior of the habitats around them is a central research question that can be tackled using visual methods as well as visual-acoustic sensor fusion. During the research cruise Falkor-160320, which brought us close to Tonga, we visually and acoustically scanned an old crater, 500 m in diameter, that contains an entire black smoker field. The videos for this cruise can be seen here:

From this cruise, and also from others, we have tremendous amounts of video and photo material. Unfortunately, deep-sea navigation data from external sensors is very inaccurate, so we have to use the visual data to refine the navigation. The general principle is that corresponding seafloor points are seen and identified in several consecutive images, and again when the robot returns to a previously visited place (loop closure). These correspondences provide geometric constraints on the camera motion, and thus on the robot motion. Once the motion is recovered, dense depth estimation techniques can be used to estimate the distance of each pixel in each image to the camera; fusing these estimates finally yields a 3D model of the environment.
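
As a rough illustration of the dense depth estimation step, the following sketch (again assuming OpenCV, a hypothetical rectified image pair, and placeholder calibration values) computes a per-pixel disparity map and back-projects it to 3D points. This is only one possible technique for this step, not a description of our full pipeline.

```python
import cv2
import numpy as np

# Hypothetical rectified image pair; rectification itself would use
# the camera motion recovered in the previous step.
left = cv2.imread("rect_left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("rect_right.png", cv2.IMREAD_GRAYSCALE)

# Semi-global matching estimates a disparity value for every pixel.
sgbm = cv2.StereoSGBM_create(
    minDisparity=0,
    numDisparities=128,   # must be divisible by 16
    blockSize=5,
    P1=8 * 5 * 5,
    P2=32 * 5 * 5,
    uniquenessRatio=10,
)
# compute() returns fixed-point disparities scaled by 16.
disparity = sgbm.compute(left, right).astype(np.float32) / 16.0

# Q is the 4x4 reprojection matrix from stereo rectification
# (illustrative values; in practice it comes from cv2.stereoRectify).
f, baseline = 800.0, 0.2
Q = np.array([[1.0, 0.0, 0.0, -640.0],
              [0.0, 1.0, 0.0, -360.0],
              [0.0, 0.0, 0.0, f],
              [0.0, 0.0, -1.0 / baseline, 0.0]])

# Back-project each pixel with a valid disparity to a 3D point.
points = cv2.reprojectImageTo3D(disparity, Q)
valid = disparity > 0
print("reconstructed", int(valid.sum()), "3D points")
```

Fusing many such per-view depth estimates from overlapping images is what produces the final consistent 3D model of the environment.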

We are currently looking for students and student assistants (HiWis) to work on several sub-problems of this; see "Student Opportunities".