Facebook AI Image Similarity Challenge - Getting Started

In this post, we will introduce the Facebook AI Image Similarity Challenge and highlight some resources to help you get started.

Jay Qi
Lead Data Scientist

Content tracing is a crucial component of every social media platform today, used for tasks such as flagging misinformation and manipulative advertising, preventing uploads of graphic violence, and enforcing copyright protections. But with the billions of new images generated every day on sites like Facebook, manual content moderation simply doesn't scale. Platforms depend on algorithms to help automatically flag or remove bad content.

In our new Facebook AI Image Similarity Challenge, hosted in partnership with Facebook AI, your task is to create models for detecting whether a query image is derived from one of the images in a large corpus of reference images.

This blog post will provide an overview of the challenge's two tracks and highlight some resources to help you get started.

Image: an original photo of a koala alongside a manipulated copy that has been rotated and overlaid with koala emoji. This is an example of the kind of image pair the Image Similarity Challenge asks competitors to detect: the original image has been rotated and combined with another image to create the manipulated image.

The challenge

Here is a quick introduction to the tracks, data, and phases of the challenge.

Two tracks

Unlike most DrivenData competitions, the Image Similarity Challenge has two tracks:

  • In the unconstrained Matching Track, your objective is simply the overall objective: identify which query images are derived from which reference images. You are free to use whatever approach you want to achieve this (aside from some rules ensuring that solutions are generalizable).
  • In the constrained Descriptor Track, your submission will be descriptors (vector representations with at most 256 dimensions) for each image in the query and reference sets. These will be used in a similarity search to achieve the overall objective of finding the query–reference image pairs.

Using image descriptors is one common approach to image similarity detection, which means you can potentially develop a single solution to submit to both tracks of the challenge. However, the Matching Track gives you more freedom in your solution, and you may find that this freedom lets you outperform Descriptor Track solutions. You might use it as simply as computing more than 256 descriptor dimensions, or you might develop an entirely different approach that does not use descriptors at all.
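To make the Descriptor Track setup concrete, here is a minimal sketch of the similarity-search stage using tiny hand-written vectors in place of learned descriptors. The helper names are illustrative only; a real submission would use a learned embedding (up to 256 dimensions) and an approximate nearest-neighbor library rather than this brute-force loop.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length descriptors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def match_queries(query_descs, ref_descs):
    """For each query descriptor, return the closest reference id and a
    confidence score (negative distance, so higher means more confident)."""
    results = {}
    for qid, q in query_descs.items():
        best_rid, best_d = min(
            ((rid, euclidean(q, r)) for rid, r in ref_descs.items()),
            key=lambda t: t[1],
        )
        results[qid] = (best_rid, -best_d)
    return results

# Toy 4-dimensional descriptors (real submissions allow up to 256 dims).
refs = {"R1": [0.0, 0.0, 1.0, 0.0], "R2": [1.0, 1.0, 0.0, 0.0]}
queries = {"Q1": [0.1, 0.0, 0.9, 0.0]}
print(match_queries(queries, refs))  # Q1 matches R1
```

At the scale of this challenge (50,000 queries against 1 million references), the brute-force search above would be replaced by an indexed nearest-neighbor search.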

There is a $100,000 prize pool for the best solutions in each track. You are welcome to participate in just one track or in both—it's up to you! You can read more about the details of the Matching Track and Descriptor Track on their respective "Problem Description" pages.

Datasets provided

For this competition, the image datasets provided can be thought of under two main categories:

  • The query and reference sets: these are images that will be used for evaluation.
  • The training set: these are additional images that you are free to use in any way for developing your solutions.

The query set contains 50,000 images. Some of these are transformed versions of images from the 1 million-image reference set, and the task of the challenge is to identify them. We provide ground truth labels for half of the query images for you to use for development evaluation, while the other half is held out for the leaderboard. For the final rankings in Phase 2 of the challenge (more on this in the next section), there will be a new, unseen query set of 50,000 images.

Submissions to this challenge will need to generate scores assessing the likelihood that query–reference image pairs come from the same source. (In the Descriptor Track, this is done by evaluating the Euclidean distance between the image descriptors.) These submissions will be evaluated by micro-average precision on the predicted pairs ranked globally by confidence score.
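As a rough illustration of this metric, here is one common way to compute average precision over a globally ranked list of predicted pairs. The function and variable names are illustrative, and the challenge's official evaluation script is the authority on the exact computation.

```python
def micro_average_precision(scored_pairs, true_pairs):
    """scored_pairs: list of ((query_id, ref_id), score) predictions.
    true_pairs: set of ground-truth (query_id, ref_id) matches.
    Pairs are ranked globally by score; precision is accumulated at
    each rank where a true pair appears."""
    ranked = sorted(scored_pairs, key=lambda p: p[1], reverse=True)
    hits, ap = 0, 0.0
    for rank, (pair, _score) in enumerate(ranked, start=1):
        if pair in true_pairs:
            hits += 1
            ap += hits / rank  # precision at this rank
    return ap / len(true_pairs)

truth = {("Q1", "R1"), ("Q2", "R7")}
preds = [(("Q1", "R1"), 0.9), (("Q3", "R2"), 0.8), (("Q2", "R7"), 0.7)]
print(micro_average_precision(preds, truth))  # (1/1 + 2/3) / 2 = 5/6
```

Note that because the ranking is global across all query–reference pairs, well-calibrated confidence scores matter as much as getting the right matches.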

Image: pairs of original and manipulated images showing how many different image transformations can be composed together to create the manipulated query images.


Unlike in a typical supervised machine learning competition, developing a model for this challenge is not as simple as training a model on provided labeled training data. The query and reference sets are intended for evaluation, not for training. Again, please ensure that you read the rules on data use describing how solutions can use the query and reference images. The purpose of these rules is to ensure that solutions are not overfitted to the evaluation data and can generalize to new data. After all, any practical real-world system will have to continuously deal with new, previously unseen images.

We expect that successful competitors will need to be creative in assembling training data. The training set of 1 million additional images is provided as a starting place for that purpose. For example, you can use image transformation tools (like the AugLy library described below) on training set images to generate your own labeled image pairs for supervised approaches. Finally, you are also allowed to bring in external datasets, as long as they comply with the rules on external data.

For further details about the provided datasets, check out the "Data" section of the "Problem Description" pages (Matching Track; Descriptor Track). You can find instructions for downloading the data on the "Data Download" pages after accepting the data license agreement.

Two phases

In addition to being split into two tracks, the Image Similarity Challenge is also split into two phases. Participants have access to the research dataset in Phase 1 for model development, and can make submissions to the Phase 1 leaderboard. The final leaderboard and prizes will be determined by Phase 2, which runs for 48 hours from October 26–27, 2021 and for which a new unseen query image set will be provided.

Note that for Phase 2 eligibility, we will require solutions to be finalized at the end of Phase 1. You will need to submit your code (and reference image descriptors for the Descriptor Track) before Phase 2. You will have up to three submissions for the final phase. The top-performing solutions for Phase 2 will win from a $200,000 total prize pool for the challenge!

Resources

Finally, here are some resources to help you get started on developing your solutions.

Benchmark model examples

For example solution implementations, check out the facebookresearch/isc2021 repository on GitHub, published by our partners at Facebook AI accompanying their paper about this challenge. This repository has code and walkthroughs for three baseline models:

  • GIST descriptors: a traditional computer vision model first published in 2001 that computes descriptors for a particular set of perceptual dimensions.
  • MultiGrain: a pretrained ResNet-50 convolutional neural network trained on ImageNet for image classification and image retrieval, published in 2019.
  • HOW+ASMK: a model that represents images using aggregated local descriptors based on convolutional neural network activations, published in 2020.

The best place to start is the docs/ subdirectory, which has walkthroughs for reproducing the baseline models' results for the competition. Scores from these benchmark models can be seen on the challenge leaderboard.

Image: screenshot of the Image Similarity Challenge leaderboard showing scores from the MultiGrain and GIST benchmark models.

Generating your own training data with AugLy

Our partners at Facebook AI have published an open-source Python library AugLy that you may find helpful for this challenge. AugLy is a general data augmentation library with one of its supported modalities being transformations of images. For example, you can use AugLy on the provided training set images to generate your own "query–reference" image pairs and use these for supervised machine learning. You can learn more about AugLy from its GitHub repository or its release announcement blog post.
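As a library-free sketch of that pair-generation idea, the loop below samples reference images, applies a transformation to each, and emits labeled positive and negative pairs. The function names are hypothetical, and the placeholder `transform` callable stands in for a real AugLy transformation pipeline (rotations, crops, overlays, and so on).

```python
import random

def make_training_pairs(image_ids, transform, n_pairs=3, seed=0):
    """Sample reference images and apply a transformation to each,
    yielding (query, reference, label) triples for supervised training.
    `transform` stands in for an augmentation pipeline such as AugLy's."""
    rng = random.Random(seed)
    pairs = []
    for ref in rng.sample(image_ids, n_pairs):
        query = transform(ref)
        pairs.append((query, ref, 1))    # positive: query derived from ref
        other = rng.choice([i for i in image_ids if i != ref])
        pairs.append((query, other, 0))  # negative: unrelated pair
    return pairs

# Dummy string "transform" on image ids, standing in for real augmentation.
pairs = make_training_pairs([f"img_{i}" for i in range(10)],
                            transform=lambda x: x + "_aug")
for p in pairs:
    print(p)
```

In practice you would chain several transformations together, since the query images in this challenge are often produced by composing multiple edits.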

Image: examples of data augmentations that AugLy can apply to an image. From the AugLy repository documentation.

Good luck!

Visit the homepages for the Image Similarity Challenge Matching Track and Descriptor Track to sign up and read about the challenge in detail. If you have any questions, you can visit the competition forum. Good luck, and enjoy the challenge!


Update July 13, 2021: Updated benchmark section to reflect addition of HOW+ASMK to the facebookresearch/isc2021 repository.
