Kelp Wanted Challenge Starter Code
Getting Started with MATLAB
We at MathWorks, in collaboration with DrivenData, are excited to bring you this challenge! The goal is to develop an algorithm that uses the provided satellite imagery to predict where kelp is present and where it is not.
Kelp is a type of algae that often grows in clusters known as kelp forests, which provide shelter and stability for many coastal ecosystems. The presence and growth of kelp are important measurements for evaluating the health of these ecosystems, so the ability to monitor kelp forests easily and consistently could be a huge step forward in coastal climate science.
In this blog, we will explore the data using the Hyperspectral Viewer app, preprocess the dataset, then create, evaluate, and use a basic semantic segmentation model to solve this challenge. Note that this model was trained on a subset of the data, so the numbers and individual file and folder names may be different from what you see in the full competition dataset.
To request your complimentary MATLAB license and access additional learning resources, check out this website!
Table of Contents:
- Explore and Understand the Data
- Import the Data
- Preprocess the Data
- Design and Train a Neural Network
- Evaluate the Model
- Create Submissions
Explore and Understand the Data
Instructions for accessing and downloading the competition data can be found here. Let's read in a sample image and label for tile ID AA498489, which we will explore to gain a better understanding of the data.
firstImage = imread('train_features/AA498489_satellite.tif');
firstLabel = imread('train_labels/AA498489_kelp.tif');
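Before going further, it can help to confirm what was actually loaded. The following is a minimal sketch rather than part of the official starter code: it assumes the files sit at the relative paths used above, and the names imagePath, labelPath, img, and lbl are our own. It also assumes the label shares the image's pixel grid, which you should verify on your copy of the data.

```matlab
% Paths follow the sample tile used above; adjust them to your download location.
imagePath = 'train_features/AA498489_satellite.tif';
labelPath = 'train_labels/AA498489_kelp.tif';

filesFound = isfile(imagePath) && isfile(labelPath);
if filesFound
    img = imread(imagePath);
    lbl = imread(labelPath);

    % Array size (rows x columns x bands) and numeric class of each file
    fprintf('Image: %s, class %s\n', mat2str(size(img)), class(img));
    fprintf('Label: %s, class %s\n', mat2str(size(lbl)), class(lbl));

    % Assumption: the label covers the same rows and columns as the image
    sameGrid = size(img, 1) == size(lbl, 1) && size(img, 2) == size(lbl, 2);
    fprintf('Label matches image grid: %d\n', sameGrid);

    % Distinct values that occur in the label
    disp(unique(lbl(:))');
else
    warning('Sample tile not found; check the dataset path.');
end
```

Knowing the numeric class early is useful because it determines whether later arithmetic needs a conversion to double first.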
The Input: Satellite Images
The input data is a set of augmented satellite images, each with seven layers or "bands", so you can think of each one as seven separate images stacked on top of each other, as shown below.
Each band looks at exactly the same patch of earth, but each contains different measurements. The first five bands contain measurements taken at different wavelengths of the light spectrum, and the last two are supplementary metrics that help characterize the environment. The following list shows what each of the seven bands measures:
- Short-wave infrared (SWIR)
- Near infrared (NIR)
- Red
- Green
- Blue
- Cloud Mask (binary - is there cloud or not)
- Digital Elevation Model (meters above sea level)
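Because the band order is fixed, one option is to look bands up by name instead of remembering their indices. The sketch below is illustrative only: bandNames and the other variable names are our own, the small stand-in tile is made-up data so the snippet runs without the dataset, and treating nonzero cloud mask values as "cloud" is an assumption to check against the problem description.

```matlab
% Band order as listed above; the short names are our own labels.
bandNames = ["SWIR", "NIR", "Red", "Green", "Blue", "CloudMask", "DEM"];

% Stand-in 4-by-4 tile with 7 bands (hypothetical data).
% With the real data: tile = imread('train_features/AA498489_satellite.tif');
tile = zeros(4, 4, numel(bandNames));
tile(:, :, 6) = [1 1 0 0; 1 1 0 0; 0 0 0 0; 0 0 0 0];   % pretend cloud mask

% Select bands by name using logical indexing on the third dimension
nir       = tile(:, :, bandNames == "NIR");
cloudMask = tile(:, :, bandNames == "CloudMask") > 0;   % assumes nonzero = cloud

% Fraction of pixels flagged as cloud
cloudFraction = nnz(cloudMask) / numel(cloudMask);
fprintf('Cloudy pixels: %.1f%%\n', 100 * cloudFraction);
```

A per-tile cloud fraction like this could later help decide which tiles deserve a closer look during preprocessing.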
Most standard images just measure the red, green, and blue values, but by including additional measurements, hyperspectral images can enable us to identify objects and patterns that may not be easily seen with the naked eye, such as underwater kelp. For more detail on what each band captures, check out the competition's problem description page.
The Spectral Bands (1-5)
Let's start by exploring the first five layers. The rescale function adjusts the values of the bands so that they can be visualized as grayscale images, and the montage function displays the bands side by side.
montage(rescale(firstImage(:, :, 1:5)));
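One detail worth knowing: called with no options, rescale uses a single minimum and maximum taken over the whole array, so a band with a small value range can look nearly black next to a band with a large one. If that happens, a possible variation is to rescale each band with its own minimum and maximum. This is a sketch under assumptions, not the approach used in this post: the stand-in bands are random made-up data, and the variable names are our own.

```matlab
% Stand-in data: five bands with very different value ranges (hypothetical).
% With the real tile: bands = double(firstImage(:, :, 1:5));
bands = cat(3, 100*rand(32), 10*rand(32), rand(32), 1000*rand(32), 50*rand(32));

% Global rescale: one min/max shared by all five bands
globalScaled = rescale(bands);

% Per-band rescale: min/max taken over the rows and columns of each band,
% giving 1-by-1-by-5 arrays that rescale applies band by band
bandMin = min(bands, [], [1 2]);
bandMax = max(bands, [], [1 2]);
perBandScaled = rescale(bands, 'InputMin', bandMin, 'InputMax', bandMax);

% Show the five bands in a single row (montage needs Image Processing Toolbox)
montage(perBandScaled, 'Size', [1 5]);
```

Per-band scaling makes each band easier to inspect on its own, but it hides the relative brightness between bands, so the global version remains the better choice when comparing magnitudes.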