
Meet the winners of the Pose Bowl challenge

Learn about the top solutions submitted for the Pose Bowl: Spacecraft Detection and Pose Estimation Challenge.

Hannah Moshontz
Senior Program Manager
Jay Qi
Lead Data Scientist

Background

Inspector spacecraft, like NASA's Seeker, are designed to conduct low-cost in-space inspections of other spacecraft. They typically have limited computing resources, but complex computing demands. Minimally, the inspector spacecraft (also called a "chaser") must locate a target spacecraft and maneuver around it while maintaining an understanding of its spatial pose. Seeker was a successful proof of concept for a low-cost inspector spacecraft that used an algorithm trained on just one type of spacecraft.

In the Pose Bowl Challenge, solvers helped advance this line of work by competing to develop solutions for in-space inspection. The data for this challenge were simulated images of spacecraft taken from a nearby location in space, as if from the perspective of a chaser spacecraft. Images were created with the open-source 3D software Blender, using models of representative target spacecraft against simulated backgrounds.

In the Object Detection track, solvers developed solutions to detect generic target spacecraft (instead of just one type) in images taken in space. Challenge labels were the bounding box coordinates for the spacecraft in each image.

The silhouette of a spacecraft against reflected light from the Earth, with a green box around it indicating an object detection annotation.
Example of labeled data in the Object Detection track.


In the Pose Estimation track, solvers worked to determine the relative pose of the inspector spacecraft camera across a sequence of images of generic target spacecraft. Challenge labels were the transformation vector (x, y, z, qw, qx, qy, qz), a translation plus a rotation quaternion, required to return the chaser spacecraft to its initial position in the chain.

Illustration of the relative pose target in the Pose Estimation track. A chaser that moves from (1) to (2) has a relative pose rotation defined by (3) and translation defined by (4).
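To make the label format concrete, here is a minimal sketch of unpacking such a vector and applying it with SciPy. The example values, and the rotate-then-translate convention, are illustrative assumptions rather than the challenge's exact frame definitions.

```python
import numpy as np
from scipy.spatial.transform import Rotation

# Hypothetical label: translation (x, y, z) plus quaternion (qw, qx, qy, qz)
# describing the pose change back to the chaser's initial position.
# These values are made up for illustration.
label = np.array([0.5, -0.2, 1.0, 0.98, 0.05, 0.15, 0.1])

t = label[:3]
qw, qx, qy, qz = label[3:]
# SciPy expects scalar-last (x, y, z, w) quaternions, so reorder;
# from_quat also normalizes the quaternion for us.
rot = Rotation.from_quat([qx, qy, qz, qw])

# Applying the relative pose to a point in the camera frame
# (rotate-then-translate convention assumed for illustration).
point = np.array([0.0, 0.0, 10.0])
print(rot.apply(point) + t)
```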


In addition, the top three solutions from both tracks were tested on the actual hardware used on chaser spacecraft (UP-CHT01-A20-0464-A11). Real Hardware Speed Bonus prizes were given to the prize-winning solutions from each track that made the fastest inferences on a representative test.

Solution constraints

To match the inherent constraints of in-space inspection, solvers in both challenge tracks were required to develop models that would be performant enough under real conditions on highly constrained compute hardware.

Solvers' solutions could not use GPUs and were limited to just 3 CPU cores and 4 GB of RAM. They also could not process images in parallel, so as to match the conditions of receiving imagery from a real camera. On top of these constraints, models had to be fast enough for real guidance, navigation, and control needs: about 1 second per image in the Detection track and 5 seconds per image in the Pose Estimation track.

Results

This challenge drew 836 participants from 87 countries. Collectively, participants submitted 1,625 solutions. All submitted solutions were automatically evaluated on two private test sets. One test set was used to calculate a public leaderboard score that participants saw during the challenge. The other test set was used to calculate a private leaderboard score that was revealed at the end of the challenge and determined final prize rankings.

In the Object Detection track, nearly 100 solutions outperformed the benchmark model. The top-scoring model achieved a Jaccard index of 0.93, meaning its predicted bounding boxes had a 93% overlap with the actual bounding boxes. The top object detection solutions converged on YOLOv8 as the best approach for low-resource object detection. Two of the three winning teams generated new synthetic images for training, which improved their final models.
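For readers unfamiliar with the metric, the Jaccard index of two boxes is the area of their intersection divided by the area of their union. A minimal implementation, assuming boxes in (xmin, ymin, xmax, ymax) format:

```python
def jaccard_index(box_a, box_b):
    """Jaccard index (IoU) of two boxes in (xmin, ymin, xmax, ymax) format."""
    # Intersection rectangle; width/height clamp to zero if disjoint.
    iw = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    ih = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = iw * ih
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# A prediction covering most of the ground-truth box scores close to 1.
print(jaccard_index((10, 10, 110, 110), (15, 12, 115, 108)))
```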

The Pose Estimation track turned out to be incredibly difficult, and solvers were only able to make limited progress during the competition. Only 8 teams beat the benchmark error score of 2.0, which corresponds to predicting a stationary spacecraft, and the top solution scored 1.90. Still, top solutions explored some interesting directions. The first- and third-place solutions leveraged classical computer vision feature matching techniques as part of their models, while the second-place solution developed a Siamese network model with a metric learning approach.

The winners' full submissions are available in an open-source GitHub repository for anyone to learn from.

Meet the winners - Object Detection Track

Some responses have been edited for conciseness. Each team's full, unedited writeup is linked below.

Ammar Ali, Jaafar Mahmoud, and Alexander Chigorin

Place: 1st

Prize: $5,000

Team name: Polynome Team

Usernames: Ammarali32, Jaafar, aachigorin

Background:

Ammar Ali: I am currently pursuing my PhD in Applied Computer Science at ITMO University, and I am in my third year of the program. In addition to my academic pursuits, I hold a senior researcher position at Polynome, where I focus on the development of multilingual large language models for rare languages, including Arabic. My expertise encompasses both computer vision and natural language processing.

Jaafar Mahmoud: I am a researcher and developer specializing in robotic vision. I hold an MSc in Intellectual Robotics and expect to defend my PhD this year, focusing on robust localization. With around 5 years of professional experience, my expertise lies primarily in 3D perception, localization, and mapping for robots.

Alexander Chigorin: As the Research Director at VisionLabs UAE, I am responsible for overseeing multiple research initiatives. My primary focus is on advancing the core metrics of our projects, consistently achieving results that are on par with or surpass the current state-of-the-art (SOTA). I actively work towards integrating these improvements into our products. My primary areas of research include object detection and human pose estimation.

Summary of approach:

Diagram of the 1st place model in the Object Detection track.
Diagram of the inference pipeline implemented by the 1st place Object Detection solution. Image courtesy of Polynome team.


Our proposed solution primarily uses YOLOv8, with three distinct variants (nano, small, and medium) that have been meticulously trained. We used a sequential ensemble strategy, taking into account two crucial factors:

  1. The confidence threshold associated with the identified spaceship.
  2. The unique condition of the competition data, which ensures that each image contains precisely one spaceship, neither more nor less.

The model first identifies the region of the object, even if the mAP is low. Once the region is detected and cropped, we employ a refiner model (YOLOv8m) that meticulously refines the detected object (a sketch of this pipeline follows the list below). This refiner has undergone rigorous training in two stages:

  1. The initial stage involved training on approximately 300,000 synthetic images. The background for these images was generated using a diffusion model, while the spacecraft models were sourced from the provided no-background images.
  2. The second stage involved fine-tuning the refiner on the competition data with slightly less intense augmentations. This process ensured that the refiner model was optimized for the specific conditions of the competition.
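Below is a rough sketch of what such a sequential, crop-then-refine pipeline could look like with the ultralytics package. The weight filenames, confidence threshold, and escalation order are placeholders for illustration rather than the team's actual configuration.

```python
import cv2
from ultralytics import YOLO

# Placeholder weight files standing in for the team's trained models.
detectors = [YOLO("yolov8n.pt"), YOLO("yolov8s.pt"), YOLO("yolov8m.pt")]
refiner = YOLO("yolov8m_refiner.pt")  # hypothetical refiner weights
CONF_THRESHOLD = 0.5                  # assumed value for illustration

def detect_spacecraft(image_path):
    """Try detectors in order of size; refine the best crop with YOLOv8m."""
    for model in detectors:
        boxes = model(image_path, verbose=False)[0].boxes
        if len(boxes) and float(boxes.conf.max()) >= CONF_THRESHOLD:
            # Each image contains exactly one spaceship, so keep the best box.
            x0, y0, x1, y1 = boxes.xyxy[boxes.conf.argmax()].tolist()
            break
    else:
        return None  # no detector was confident enough

    # Crop the detected region and let the refiner tighten the box.
    img = cv2.imread(image_path)
    crop = img[int(y0):int(y1), int(x0):int(x1)]
    refined = refiner(crop, verbose=False)[0].boxes
    if len(refined):
        rx0, ry0, rx1, ry1 = refined.xyxy[refined.conf.argmax()].tolist()
        return (x0 + rx0, y0 + ry0, x0 + rx1, y0 + ry1)
    return (x0, y0, x1, y1)
```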

Check out Polynome team's full write-up and solution in the challenge winners repository.

Dung Nguyen Ba

Place: 2nd

Bonus prize: Real hardware speed bonus

Prize total: $5,000

Username: dungnb

Social media: /in/dungnb1333/

Background:

I am a Principal AI Engineer/Tech Lead at FPT Smart Cloud, Vietnam. I have 8 years of experience in deep learning and computer vision, 1 year of experience in embedded software development, and 5 years of experience in Android application development. I am also a Kaggle grandmaster (6 gold medals, 4 wins).

Summary of approach:

I chose YOLOv8s at an image size of 1280 for its balance of runtime constraints and high accuracy. The keys to my solution were synthetic data generation, external data, and post-processing.

Synthetic Data Generation: I created 93,778 synthetic images by overlaying the no-background spacecraft images on various backgrounds with transformations. Five types of synthetic data were generated, varying spacecraft size and positioning, and adding fake antennas.
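As an illustration of this style of synthetic data generation, here is a minimal compositing sketch using Pillow. The scale and placement ranges are assumptions for illustration, not the winner's actual parameters.

```python
import random
from PIL import Image

def make_synthetic(spacecraft_path, background_path, out_size=(1280, 720)):
    """Paste a no-background (RGBA) spacecraft onto a background at a
    random scale and position; return the image and its bounding box."""
    bg = Image.open(background_path).convert("RGB").resize(out_size)
    ship = Image.open(spacecraft_path).convert("RGBA")

    # Random scale and placement (illustrative ranges).
    scale = random.uniform(0.1, 0.6)
    w = max(1, min(int(ship.width * scale), out_size[0] - 1))
    h = max(1, min(int(ship.height * scale), out_size[1] - 1))
    ship = ship.resize((w, h))
    x = random.randint(0, out_size[0] - w)
    y = random.randint(0, out_size[1] - h)

    # The alpha channel doubles as the paste mask.
    bg.paste(ship, (x, y), mask=ship)
    bbox = (x, y, x + w, y + h)  # bounding box label for training
    return bg, bbox
```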

External Data: To enhance model generality, I incorporated 3,100 satellite images with masks from a CVPR paper.

Augmentation Techniques: Augmentation techniques included random and median blurs, CLAHE, grayscale, flips, brightness/contrast adjustments, and mosaic.

Post-processing: Post-processing involved selecting the highest-confidence bounding box. If max(box_width/image_width, box_height/image_height) > 0.7, my solution would pad the input image by 150px. From my experience in previous object detection competitions, YOLO handles very large objects poorly.
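A sketch of how that rule might look in code, using the 0.7 ratio and 150px padding quoted above; whether detection is then re-run on the padded image is a detail left to the full write-up:

```python
import cv2

def maybe_pad(image, box, pad=150, ratio_threshold=0.7):
    """Pad the image when the detected box dominates the frame, so the
    detector can be re-run with the object occupying less of the image."""
    h, w = image.shape[:2]
    x0, y0, x1, y1 = box
    if max((x1 - x0) / w, (y1 - y0) / h) > ratio_threshold:
        padded = cv2.copyMakeBorder(image, pad, pad, pad, pad,
                                    cv2.BORDER_CONSTANT, value=(0, 0, 0))
        return padded, True
    return image, False
```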

Check out Dung Nguyen Ba's full write-up and solution in the challenge winners repository.

Ashish Kumar

Place: 3rd

Prize: $2,000

Username: agastya

Background:

I am a data science enthusiast and deep learning practitioner who primarily competes in data science competitions.

Summary of approach:

My solution is a single YOLOv8s model trained at an image size of 1280 for 30 epochs. YOLOv8s provides the best tradeoff between leaderboard score and inference time. I initially started the competition with YOLOv8n and an image size of 640, but the performance was very poor and the inference time was around 105 minutes. While reading the YOLOv8 documentation, I came across the OpenVINO section, which claims a 3x increase in inference speed. I submitted the YOLOv8n model with OpenVINO and it finished in just 30 minutes. Then, I tried YOLOv8s with an image size of 1280 and performed some manual hyperparameter tuning, which improved performance. Finally, I trained the model on the full dataset, which further increased the score.
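For reference, exporting a YOLOv8 model to OpenVINO with the ultralytics package looks roughly like this; the weight and image filenames are placeholders:

```python
from ultralytics import YOLO

# Export trained weights to OpenVINO for faster CPU inference.
model = YOLO("yolov8s.pt")  # placeholder for the trained weights
model.export(format="openvino", imgsz=1280)

# The export writes a directory that loads like a normal model.
ov_model = YOLO("yolov8s_openvino_model/")
results = ov_model("example.jpg", imgsz=1280)
```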

Check out Ashish Kumar's full write-up and solution in the challenge winners repository.

Meet the winners - Pose Estimation Track

Some responses have been edited for conciseness. Each team's full, unedited writeup is linked below.

Xindi Liu

Place: 1st

Prize: $10,000

Username: dylanliu

Background:

My real name is 刘欣迪 (Xindi Liu), but I usually go by my English name, Dylan Liu, online. I am a freelance programmer working on AI, with 6 years of experience. One of my main sources of income now is prize money from data science competition platforms.

Summary of approach:

I first used OpenCV to extract rotation angles between images, that is, to convert the image sequences into rotation sequences. The rotation sequences were then used as the feature input to the decoder of a T5 model, which I trained with a custom loss identical to the official performance metric. During training, I used a data augmentation method that shuffles the input sequences.

Extracting image features directly through deep learning would lead to serious over-fitting (because there are very few types of spacecraft), and the hardware requirements would be very high, so I converted the image sequences into rotation sequences. In generation mode, the decoder of a T5 model only takes one piece of feature input at a time and does not use subsequent data, which perfectly met the requirements of this competition. Since the movement trajectory of the camera is relatively random, I shuffled the sequence order as a method of data augmentation.
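A minimal sketch of that first step, estimating the relative rotation between two frames with classical feature matching in OpenCV; ORB features are an illustrative choice, and K is the camera intrinsic matrix, assumed known from the simulation setup:

```python
import cv2
import numpy as np

def relative_rotation(img1, img2, K):
    """Estimate the relative camera rotation between two grayscale frames
    via ORB matching and essential-matrix decomposition."""
    orb = cv2.ORB_create(2000)
    kp1, des1 = orb.detectAndCompute(img1, None)
    kp2, des2 = orb.detectAndCompute(img2, None)

    # Brute-force Hamming matching suits ORB's binary descriptors.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des1, des2)

    pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

    E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC)
    _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=mask)
    return R  # 3x3 rotation matrix between the two camera poses
```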

Check out Xindi Liu's full write-up and solution in the challenge winners repository.

Ioannis Nasios

Place: 2nd

Prize: $8,000

Username: ouranos

Social media: /in/ioannis-nasios-58b543b0/

Background:

I am a senior data scientist at Nodalpoint Systems in Athens, Greece. I am a geologist and oceanographer by education who turned to data science through online courses and by taking part in numerous machine learning competitions.

Summary of approach:

My solution uses a small object detection model (YOLOv8n) trained on the Detection track dataset, followed by a Siamese model with an EfficientNetB0 backbone.
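A minimal Keras sketch of a Siamese model with a shared EfficientNetB0 backbone is below. The input size, head layers, and direct 7-value pose output are assumptions for illustration; the actual training setup, including the metric learning details, is in the write-up.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

# Shared backbone that embeds both images of a pair into one vector each.
backbone = tf.keras.applications.EfficientNetB0(
    include_top=False, weights=None, pooling="avg",
    input_shape=(224, 224, 3),  # input size assumed for illustration
)

img_a = layers.Input((224, 224, 3))
img_b = layers.Input((224, 224, 3))
emb_a = backbone(img_a)  # the same weights embed both inputs
emb_b = backbone(img_b)

# Combine the embeddings and regress the 7-value relative pose
# (x, y, z, qw, qx, qy, qz); head sizes are illustrative.
merged = layers.Concatenate()([emb_a, emb_b])
hidden = layers.Dense(256, activation="relu")(merged)
pose = layers.Dense(7)(hidden)

model = Model([img_a, img_b], pose)
```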

Check out Ioannis Nasios's full write-up and solution in the challenge winners repository.

Stepan Konev and Yuriy Biktairov

Place: 3rd

Bonus prize: Real hardware speed bonus

Prize total: $10,000

Team name: OrbitSpinnersChallengeWinners

Usernames: sdrnr, ybiktairov

Background:

Stepan Konev: I work as a Machine Learning Engineer developing recommender systems at scale. Previously, I worked as a researcher-developer building a motion prediction module for self-driving cars and delivery robots.

Yuriy Biktairov: I am a computer science PhD student at the University of Southern California. My research primarily focuses on neural network verification techniques and motion prediction for autonomy.

Summary of approach:

Our solution is based on classic computer vision techniques and includes 3 major steps. First, we match visual features between a given target image and the base image (the first one in the sequence). Given a set of matches, we recover the relative pose using a variation of the RANSAC algorithm. Finally, we validate the resulting pose and fall back to a heuristic prediction if the reconstructed result is deemed unrealistic.
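A sketch of how the recover-and-validate step might look in OpenCV, assuming matched feature points are already available; the RANSAC settings and the inlier and rotation-angle thresholds are illustrative, not the team's values:

```python
import cv2
import numpy as np

def identity_pose():
    """Heuristic fallback: assume the chaser has not moved."""
    return np.eye(3), np.zeros((3, 1))

def estimate_pose(pts_base, pts_target, K):
    """Recover the relative pose from matched points with RANSAC, falling
    back to the heuristic when the reconstruction looks unrealistic."""
    E, mask = cv2.findEssentialMat(pts_base, pts_target, K,
                                   method=cv2.RANSAC, threshold=1.0)
    if E is None:
        return identity_pose()

    inliers, R, t, _ = cv2.recoverPose(E, pts_base, pts_target, K, mask=mask)

    # Validation: few inliers or an implausibly large rotation suggest a
    # bad reconstruction (thresholds are illustrative).
    angle = np.degrees(np.arccos(np.clip((np.trace(R) - 1) / 2, -1, 1)))
    if inliers < 15 or angle > 60:
        return identity_pose()
    return R, t
```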


Thanks to all the challenge participants and to our winners! And thank you to NASA for sponsoring this challenge!


The thumbnail and banner image is an example image from the Pose Estimation track. Image courtesy of NASA.
