Creating an ocean of data with the help of underwater robots

When your mission is monitoring life in the vastness of the ocean, conventional technology can only take you so far.

For three decades, a team at the Monterey Bay Aquarium Research Institute (MBARI) has been collecting tens of thousands of hours of video data and building an annotation reference system to sift through it all. 

“It's a wealth of information, but nobody outside of our firewalls can access it, at least not yet,” says Dr. Kakani Katija, leader of the Bioinspiration Lab at MBARI, which develops novel instruments and techniques for studying deep-sea life.

Like a lot of knowledge workers, scientists can become siloed, missing out on potentially useful findings discovered just outside their field of focus. So how might they break through barriers to ensure everyone who could benefit from the data has access to it? That’s the question Katija and her team are trying to answer. 

In search of breakthrough ways to collect, analyze, and share data, they’ve been developing technologies that combine artificial intelligence and robotic underwater vehicles. With the help of tools like EyeRIS—which uses lightfield imaging to capture 3D video in deep, dark ocean water—they’re revealing views that have never been possible until now. 

Last July, the Bioinspiration Lab launched FathomNet, an open-source image database designed to make MBARI’s data more broadly available to scientists worldwide. Here’s a glimpse into how Katija and her team plan to spark curiosity, encourage participation, even gamify parts of the project—all in service of saving the oceans. 

What drew you to your role at the Bioinspiration Lab?
My background is in aerospace engineering. I thought that might be the easiest pathway to becoming an astronaut someday. I was always enamored of this idea of discovering new life and exploring new places. But space isn't the only place where that's happening. The ocean is full of life that we know nothing about. In some cases, that life is unknown to science.

That's part of why I do a lot of technology development. What are the observational gaps? How do we fill them? What are the needs of the marine science community that we can address through technology? MBARI is kind of a perfect fit for that, because the whole institution is heavily involved in the development of new technologies or approaches to understanding the ocean.

When did you first become interested in exploring the use of ML and AI for ocean exploration?
For me, it's been over five years. But from an organizational standpoint, we've spent 30 years collecting visual data and having people process that data manually. MBARI is already sitting on this massive data resource. How do we leverage it to help solve much broader problems, not just our own? We wouldn't have had this starting point if it weren’t for the foresight of our founder, David Packard, who said we need to make sure we've built a video annotation reference system that allows us to search this data. 

My longer-term goal is to create autonomous vehicles that can go out and search for new life. Being able to distinguish between known life and unknown life requires new algorithms we don't currently have. So we need to create libraries of the life we do know, then make it easier to have human experts in the loop telling us when we don't know something.

You’ve mentioned how there’s been a paradigm shift in your field driven by robotics. Could you walk us through the evolution of technology in ocean exploration?
Ocean exploration has predominantly been done by big organizations that have access to the ocean. It's astonishing to me that an entire continent like Australia doesn't have a research-class, remotely operated vehicle on its shores. That means the outcomes have been somewhat restrictive.

We've only achieved biological monitoring of maybe 7% of the ocean surface collectively. So we’re having to rethink how we carry out distributed observations and how we create avenues for people who might not be at these large ocean institutions but have the right to participate in this process. That's why we’re moving from research-class vessels and remotely operated vehicles to looking at autonomous systems that could be deployed anywhere.

But we’re also thinking more holistically about how you get the data. Our group took a look at how we were doing things and how we want to move forward. We've pivoted a bit to thinking more about how we improve access. There's this focus on data analysis and data pipelines, but also, how do we get what we do on these big, remotely operated vehicles onto autonomous systems that can scale better? We need to make sure tools that we spend a lot of time developing actually get used and adopted by the broader community.

Why are you passionate about bringing diverse scientific communities together?
The problems we’re trying to address in the ocean are so massive that no individual group or institution will be able to solve them. Figuring out ways to collaborate effectively with groups is really important. This is part of why FathomNet exists. A lot of different groups are collecting a lot of data to answer their particular science question. But that science question might be really stove-piped and focused. 

The thing about visual data is that there's a plethora of uses for that information. Like, you might be a fish biologist collecting data to understand fish distributions. But you're also collecting information on the habitat. There are other types of animals like corals or sponges that could be captured in the same visual data. If you don't make that data more accessible, you're not able to fully extract the information from it. In the beginning, it's very expensive and difficult to collect visual data in the ocean. But once you collect it, there's also a lot of value, not only in the science that comes out of it, but also in how it could be shared and used by other people.

With FathomNet, we tried to build an ecosystem around visual data sharing, particularly visual data that’s been labeled so it can be used by marine scientists, taxonomists who are curious about animal distributions or new species, but also computer scientists and data scientists—people who know algorithms. How do we make it easier to spark new collaborations between the ocean science community and computer science community? The goal of FathomNet is to enable these disparate groups that have real challenges to work together and solve them.

What’s your process for collaborating with other researchers and scientists? 
It's always a work in progress. For instance, FathomNet is one project. The list of core collaborators includes people in the for-profit space, the nonprofit space, academia, and governments. It's challenging to have these different groups and perspectives work together on any project. The number one thing I try to understand at the outset is, what is the motivation for this group? What is it they need to be successful? Once we all understand that, the process of coming up with solutions is a little easier. 

Then, we also take an honest look at the knowledge base of the current team and ask what’s missing. FathomNet was the first approach. Then we realized, that's great, but it's not really engaging. How do I engage enthusiasts who want to help with the data processing pipeline, people who don't have computer science backgrounds or know how to set up machine learning pipelines? How can they use FathomNet for their use case?

That's why we started to create this Ocean Vision AI program, where FathomNet is one of the products. We're building a portal where people can upload their own visual data, link it in with ML processing pipelines, and have an end-to-end solution for their analysis needs. The last thing we're creating is FathomVerse. It's a video game that people can play on their phones to help improve the artificial intelligence that we use to observe life in the ocean.

How did you build a team with the AI expertise necessary to create FathomNet?
Our original team didn't have expertise in human-AI interactions. How do you best collaborate with artificial intelligence systems? We didn't have expertise in engagement. Most of us are scientists. We write papers. We dabble in science communication, but we don't know about building community. How do you come up with the right message? What is really engaging? 

Like with video games—there's a science behind how you build a game to be engaging and change player behavior towards a particular outcome. Sometimes those are profit driven, but in our case, it's data driven. We quickly figured out we needed different perspectives. It takes a lot of time, building those relationships, building trust, and getting to know people in these various communities. But there's this level of comfort with just not knowing, being okay with that lack of knowledge, and reaching out to people who are experts in that field.

How did you come up with the idea to add elements of gamification?
The game, FathomVerse, will hopefully launch in April 2024. None of us have done this, but we know there are certain things that really resonate with people. The search for life, the search for connection is powerful. So we're hoping that will be helpful in giving people some way to interact and participate, as opposed to just watching YouTube videos. Like, here's a way you can actually get involved.

I thought a lot about science communication around the ocean versus, like, space exploration. And sadly, in most environmental programming, it's all doom and gloom, and for good reason. The situation is pretty dire. But that, unfortunately, isn't inspiring. So can we create a call to action that's not only inspiring, but also a real way for people to contribute to better understanding our ocean and helping us monitor the ocean as it changes because of our activities?

Looking ahead, what are the next repetitive tasks you'd like to automate?
The annotation pipeline is a tough one. It's a long problem. But our group is really focused on, how do we get these algorithms running on vehicles in real time? We could be deploying vehicles to achieve a persistent presence, but if you're collecting an infinite amount of visual data, what do you do with this information? Can we convert it on the fly into names of animals and counts of animals? Can we do real-time monitoring? 

The ocean is a challenge because of water. It's communication-constrained. If we're pushing data off of our vehicles that are words or counts as opposed to actual images and video, that would be a huge improvement in our capabilities. So how do we take what remotely operated vehicles do, which requires a human in the loop to fully operate, and do the same thing in a more automated fashion on an autonomous system that only needs human input when it's absolutely required?

What part of your job do you enjoy so much, you don’t want it to be automated? 
Going offshore. Whenever I talk to people about the future of ocean exploration, I don't think autonomous systems are going to completely replace these remote systems. I think we need a big toolbox. These autonomous systems will allow us to survey an area much more quickly. But at the end of the day, we need eyes on things. We need to manipulate those things. These autonomous systems won't be able to do that. But going offshore and seeing something in a completely new light is something I think automation will never replace.

This interview has been edited and condensed for clarity.

This article was originally featured on the Dropbox blog.

