Teaching with DrivenData Competitions

Inspiration and resources for teaching students data science, machine learning, and AI skills with DrivenData competitions.

Hannah Moshontz
Senior Program Manager

Machine learning competitions offer rich opportunities for learning and teaching. Competitions provide an experiential learning environment, featuring a motivating problem, a clear objective, access to all necessary materials and tools, and iterative feedback.

As a result, we often see instructors use competitions to build and demonstrate applied data skills. In fact, DrivenData competitions were first run by graduate students, partly to give more learners ways to connect with social impact data problems.

DrivenData competitions are great for teaching because they are:

  • Cumulative: Students draw on many different skills and concepts
  • Applied: Real-world datasets and compelling applications
  • Rigorous: Built on industry best practices
  • Fair: Performance is transparently and objectively measured
  • Scalable: Platform easily accommodates many students
  • Fun: A live contest lets students track their own and their peers' performance

In this blog post, we offer tips and resources for teaching with DrivenData machine learning competitions. This post includes:

  • An overview of the types of competitions available for educational use
  • Examples of how competitions have been used to support learning
  • Some consolidated resources to get started

Types of DrivenData Competitions Useful for Teaching


There are over 65 practice, active, and completed competitions on drivendata.org that you can explore on the competition search page. Use the tabs on the left to filter by impact domain (e.g., climate, science, privacy, health), difficulty, and open data or solutions.

Three types of DrivenData competition are particularly useful for instruction, as described below.

| Type | Summary | Teaching opportunities |
| --- | --- | --- |
| Practice competitions | Open competitions with no prize offerings | Timing is flexible, and data can be used to design any exercise or assignment |
| Active prize competitions | Open prize competitions with a live leaderboard | Students can participate in a real prize competition alongside experts, and can learn from the winning strategies after the competition closes |
| Closed prize competitions with open data | Closed prize competitions with data available for people to play with, and winning solutions documented | Timing is flexible, there is a large variety of topics and data modalities, and data can be used to design any exercise or assignment |

These competitions include a variety of data types and problem complexity, making them appropriate for a range of courses and instruction levels. In the next section we share some examples of different ways instructors can draw on these resources to support learners.

Examples of Use


Instructors can incorporate DrivenData competitions or competition materials into their instruction in many different ways. We give recommendations and examples below, with instructors of college or graduate level data science or applied statistics courses in mind. However, these examples can be adapted to other skill levels and course topics through assignment design or selection of a competition. For instance, the PREPARE Challenge could be fodder for behavioral or health sciences students, and the Pale Blue Dot Visualization Challenge could be great for environmental studies students.

Participate in an Active Competition

As part of a course-long project or extra credit opportunity, students can participate in a competition individually or in groups. This is the simplest option for instructors because the competition provides all the context and instructions that students need. Optionally, students can write a summary of their participation and its relevance to course concepts. Having students participate in an active competition works particularly well when the submission period of the competition begins just before the start of the course. On request, we can make a custom leaderboard just for your class or for different sections of your class.

DrivenData Competitions to use: Any active prize or practice competition
Skill options: EDA, feature engineering, troubleshooting, model development, model interpretation and description, documentation, technical writing, visualization
Assessment: Credit/no credit, or grades based on leaderboard performance, quality of code, or a customized final report
Variations: Require a particular code structure, like Cookiecutter Data Science, to encourage best practices in code organization and reproducibility. For less advanced classes, provide or use existing starter code and focus on specific skills like feature engineering. For more advanced classes, place additional constraints or conditions on student solutions (e.g., students must use a particular algorithm type or must make inferences quickly).
Difficulty: Intermediate or Advanced, but flexible depending on the selected competition and design of the assignment

Explore active prize or practice competitions on the competition search page.


Use Open Data from Closed Prize Competitions

As part of a problem set, in-class demonstration, exam, or other project assignment that requires model development, you can use the open data from a closed prize competition. Since closed competitions do not offer automatic scoring of predictions and models, this option works best when you have an assignment idea in need of cleaned and well-documented data.

There are open datasets covering a variety of modalities and topics. For example, with open data from closed DrivenData competitions, students can train models to identify the composition of Martian rocks and soil samples, estimate above-ground forest biomass from satellite images, identify individual beluga whales from overhead images, classify animals in camera trap images, or forecast changes in Earth’s magnetic field from solar wind measurements.

DrivenData Competitions to use: Any competition with open data
Skill options: Flexible to fit a huge range of data science or statistical skills
Assessment: Grades can be based on model performance, or a submitted report or presentation.
Variations: For practice with data wrangling, students can find, download, and prepare data for analysis as part of the assignment.
Difficulty: All skill levels.
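
Because closed competitions do not score predictions automatically, an instructor may want a small local scorer for a class. The following is a minimal sketch, assuming the instructor holds out a set of labels and each student submits predictions keyed by sample ID. The function names `score_submission` and `class_leaderboard`, the dict-based submission format, and the choice of RMSE as the metric are illustrative assumptions, not part of any DrivenData tooling.

```python
import math


def score_submission(truth, predictions):
    """Return the RMSE between held-out truth and one student's predictions.

    Both arguments are dicts mapping a sample ID to a numeric value
    (a hypothetical format chosen for illustration).
    """
    if not truth:
        raise ValueError("truth must contain at least one sample")
    missing = set(truth) - set(predictions)
    if missing:
        raise ValueError(f"submission is missing {len(missing)} sample(s)")
    squared_errors = [(predictions[key] - truth[key]) ** 2 for key in truth]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def class_leaderboard(truth, submissions):
    """Rank students by RMSE, lowest (best) score first.

    `submissions` maps a student name to that student's predictions dict.
    """
    scores = {
        name: score_submission(truth, preds)
        for name, preds in submissions.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])
```

In practice the metric should match the one described on the original competition page, so that student scores are comparable with the published results.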

Analyze Winning Solutions from Closed Prize Competitions

As part of an in-class or take-home exercise for individuals or groups, students can review and analyze the winning solution code from a competition. Students can pick the competition or be assigned one. The content of their review can be tailored to course goals and data availability, but could involve reviewing code, hunting for errors, reproducing winning model predictions, summarizing a single solution, comparing and contrasting approaches across solutions, or describing how to combine or ensemble the approaches from the top solutions.

DrivenData Competitions to use: Any closed competition with announced winners
Skill options: Code literacy, troubleshooting, technical writing, comparative analysis, presentation
Assessment: Grades based on the quality of the review or assessment
Variations: Have students use the winners' written reports rather than the code itself, or have students attempt to solve the competition problem themselves before doing this exercise.
Difficulty: Beginner or Intermediate
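
For students asked to describe how top solutions might be combined, one simple starting point is to average the predictions of several models. The sketch below shows plain or weighted averaging, assuming each solution's predictions are already available as a list of numbers in the same sample order. `average_ensemble` is a hypothetical helper written for illustration, and actual winning solutions may combine models in other ways.

```python
def average_ensemble(prediction_sets, weights=None):
    """Combine several models' predictions by (weighted) averaging.

    `prediction_sets` is a list of equal-length lists, one per model.
    `weights` is an optional list with one non-negative weight per model.
    """
    if not prediction_sets:
        raise ValueError("need at least one set of predictions")
    n_samples = len(prediction_sets[0])
    if any(len(preds) != n_samples for preds in prediction_sets):
        raise ValueError("all prediction sets must have the same length")
    if weights is None:
        weights = [1.0] * len(prediction_sets)
    if len(weights) != len(prediction_sets):
        raise ValueError("need exactly one weight per prediction set")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    # Weighted mean across models, computed sample by sample.
    return [
        sum(w * preds[i] for w, preds in zip(weights, prediction_sets)) / total_weight
        for i in range(n_samples)
    ]
```

Students can then compare the ensemble's score with each individual model's score and discuss why averaging does or does not help.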

Consider Data Science Ethics

As part of a take-home exercise or project, students can use a competition problem or a prize-winning model to practice applying ethical considerations. For example, students can explore the different consequences of false positives and false negatives in the context of algorithms for detecting harmful algal blooms from the Tick Tick Bloom challenge, or can consider which issues would be important to monitor after deploying an algorithm that estimates seasonal snowpack in the Western U.S. from the Snowcast Showdown.

DrivenData Competitions to use: Any competition
Skill options: Ethics
Assessment: Grades based on the quality of the paper or report
Variations: For an easier in-class version, just use the problem to abstractly reflect on ethical challenges. For a more advanced version, have students apply Deon to their own solutions.
Difficulty: All skill levels
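
The trade-off between false positives and false negatives can be made concrete with a short exercise. The sketch below counts both error types at a chosen decision threshold and attaches a cost to each. The function names, the toy labels and scores, and the cost values are all assumptions for illustration only and do not come from any challenge.

```python
def confusion_counts(y_true, y_score, threshold):
    """Count true/false positives and negatives at a decision threshold."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for label, score in zip(y_true, y_score):
        predicted_positive = score >= threshold
        if predicted_positive and label:
            counts["tp"] += 1
        elif predicted_positive and not label:
            counts["fp"] += 1
        elif label:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


def total_cost(y_true, y_score, threshold, cost_fp, cost_fn):
    """Total cost of errors, given a cost per false positive and false negative."""
    counts = confusion_counts(y_true, y_score, threshold)
    return counts["fp"] * cost_fp + counts["fn"] * cost_fn


# Toy example: 1 = event present, 0 = absent (made-up values).
y_true = [1, 0, 1, 0, 1]
y_score = [0.9, 0.6, 0.4, 0.2, 0.7]
for threshold in (0.3, 0.5, 0.8):
    cost = total_cost(y_true, y_score, threshold, cost_fp=1, cost_fn=10)
    print(threshold, confusion_counts(y_true, y_score, threshold), cost)
```

Students can vary `cost_fp` and `cost_fn` to reflect who bears each kind of error, then discuss how the preferred threshold shifts.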

Resources

| Category | Resource | Description |
| --- | --- | --- |
| Finding competitions | Active prize competitions | Ongoing competitions with prizes |
| Finding competitions | Practice competitions | Open competitions for practice, no prizes |
| Finding competitions | Competitions with open data | Competitions featuring publicly available datasets |
| Competition-specific resources | Winner's announcement blog posts | Blog posts titled "Meet the Winners of the X Challenge" for completed competitions |
| Competition-specific resources | Winning solutions and write ups | GitHub repository with winning solutions for completed competitions |
| Competition-specific resources | Forum | Q&A about competitions, organized by competition categories |
| Competition-specific resources | Community code | Feature for sharing notebooks and code snippets, accessible from competition page menu |
| Topic and data guides | Beginner's Guide to Satellite Data | Introduction to working with satellite imagery |
| Topic and data guides | Open data and AI applications for climate action | Resources for AI applications in climate action |
| Topic and data guides | Open earth observation data related to select SDGs | Data sources for food access, clean water, and climate action |
| Topic and data guides | Ethics checklist for data science | Ethical considerations in data science projects |
| Topic and data guides | Practical guidance for reproducible data science | Template and best practices for reproducible data science |

Closing Thoughts

DrivenData competitions offer educators practical opportunities to engage students in real-world data science problems. Teachers can use competitions in many different ways in their course design, including direct or skills-focused participation in competitions, solution walkthroughs and reproductions or extensions, and assignments that require students to apply important concepts, like ethical considerations, to different kinds of problems or algorithms. Competitions can be used to serve many different course goals and student skill levels. With the right guidance and resources, students can make significant contributions to solving important social issues with data science.

If you’d like advice on teaching a particular skill with DrivenData competitions, or if you want to create a custom leaderboard for your class, please get in touch!

Thumbnail and banner image by Sylvia Yang on Unsplash.
