Introducing DrivenData

As we begin launching our first competitions, we thought it would be a good idea to lay out what exactly we're trying to do and why.

If your goal is to change the future, it helps to have really good predictions about what that future looks like.

And there are a lot of people interested in changing the future. Amazon wants to increase the number of goods you order through their site, so they predict which ones you might want to buy next and when. Twitter wants to boost your use of their platform, so they predict which tweets you will ignore and which you will engage with. Facebook and Google want to increase the number of ads you click on their sites, so they predict your personal click-through behavior. They have gotten very good at making these predictions.

But there are many other reasons to want to change the future. Educators want to increase the number of students graduating high school. Health workers want to improve the overall health of a population at a sustainable cost. Microlenders want to give more individuals in the developing world a chance to pursue their dreams without incurring default. Conservationists want to curb our energy usage without hampering productivity. Governments want to prevent fires from destroying lives and property.

This is where we come in. In the quest to make these difficult but important changes, we must arm ourselves with state-of-the-art predictions. Predict, while students are still in junior high, which of them are likely to drop out before graduation, so teachers can intervene earlier. Predict which individuals will be able to repay their microloans, even when banks have shut them out in the past. Predict where fires are more likely to break out, and get there first.

Source: flickr user thenationalguard
A different kind of big data conflagration.

In today's world, the people who can make these predictions better than anyone else are data scientists. They are the modern-day fortune tellers, but instead of crystal balls they wield datasets. Armed with skills in statistics and computer science, the data scientist takes large datasets and builds smart, creative, flexible models for what is likely to happen next. In 2011, there was more data produced than in all the previous years of human history combined. The quantity and variety of data available to us are exploding, and the individuals who can manipulate and illuminate these data have incredible value to offer.

At DrivenData, we want to bring cutting-edge practices in data science and crowdsourcing to some of the world's biggest social challenges and the organizations taking them on. We host online challenges, usually lasting 2-3 months, where a global community of data scientists competes to come up with the best statistical model for difficult predictive problems that make a difference.

Just like every major corporation today, nonprofits and NGOs have more data than ever before. And just like those corporations, they are trying to figure out how to make the best use of their data. We work with mission-driven organizations to identify specific predictive questions that they care about answering and can use their data to tackle.

Source: flickr user plenty
In predictive modeling, trying many different approaches is crucial.

Then we host the online competitions, where experts from around the world vie to come up with the best solution. Some competitors are experienced data scientists in the private sector, analyzing corporate data by day, saving the world by night, and testing their mettle on complex questions of impact. Others are smart, sophisticated students and researchers looking to hone their skills on real-world datasets and real-world problems. Still more have extensive experience with social sector data and want to bring their expertise to bear on new, meaningful challenges, with immediate feedback on how well their solutions perform.

Like any data competition platform, we want to harness the power of crowds combined with the increasing prevalence of large, relevant datasets. Unlike other data competition platforms, our primary goal is to create actual, measurable, lasting positive change in the world with our competitions. At the end of each challenge, we work with the sponsoring organization to integrate the winning solutions, giving them the tools to drive real improvements in their impact.

"The best minds of my generation are thinking about how to make people click ads. That sucks." — Jeff Hammerbacher, 2011

Equipped with good predictions, we have opportunities to change the course of our planet that we've never had before. We want to tackle the hairiest, most challenging, and most meaningful problems in the world today. We are building a community of data experts who can take them on. This is the new frontier of social good.

Getting involved

We are launching soon and we want you to join us!

If you want to get updates about our launch this fall with exciting, real competitions, please sign up for our mailing list here and follow us on Twitter: @drivendataorg.

If you are a data scientist, feel free to create an account and start playing with our first sandbox competitions.

If you are a nonprofit or public sector organization, and want to squeeze every drop of mission effectiveness out of your data, check out the info on our site and let us know!



Cheers,

Peter, Greg, and Isaac
The DrivenData Team

