Happy April Fools' Day from DrivenData (and our pets)

Hear from the DrivenPets about the results of our competition on the pet-productivity connection for remote workers

The DrivenPets
Guest writer

With the onset of the pandemic we experienced an exponential increase in belly scratches, cuddles, and opportunities to cause chaos. The level of accuracy achieved by the competitors will help us make sure humans continue to prioritize us over all other things in their lives.

Pets of the DrivenData team

The Challenge

On April 1st, DrivenData released the Pawsitive Predictive Values competition to explore the relationship between remote worker productivity and pet cuddles. The challenge data was collected by wearable technology monitoring human–animal distance, as well as productivity data measuring lines of code written per second. Four DrivenData team members and their pets, shown below, agreed to be tracked as part of the competition.


Nico

Poncie

Teddy

Titus

DrivenPets in the competition dataset

But hold on, didn't the competition come out on April Fools' Day? Is that just a coincidence? We think not! Once again, we DrivenPets have played our humans like fiddles. Or balls of yarn. Or your headphones. Or your shoes. Toys!

Let's take a look at the competition training data. In the plot below, human-animal distance in meters (distance) is on the x-axis and the human's productivity in lines of code written per second (lines_per_sec) is on the y-axis.

We have a feline that this data is suspicious... let's try plotting it separately for each pet.
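One way to split the scatter plot by pet is sketched below. It assumes the training data is a pandas DataFrame with the `pet_name`, `distance`, and `lines_per_sec` columns used in this post. The helper name `plot_by_pet` and the tiny stand-in DataFrame are made up for illustration and are not the competition data.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd


def plot_by_pet(df):
    """Draw one scatter panel per pet (hypothetical helper, not from the competition)."""
    pets = sorted(df["pet_name"].unique())
    # squeeze=False keeps `axes` two-dimensional even if there is only one pet
    fig, axes = plt.subplots(len(pets), 1, sharex=True, squeeze=False)
    for ax, pet in zip(axes[:, 0], pets):
        subset = df[df["pet_name"] == pet]
        ax.scatter(subset["distance"], subset["lines_per_sec"], s=5)
        ax.set_title(pet)
        ax.set_ylabel("lines_per_sec")
    axes[-1, 0].set_xlabel("distance")
    fig.tight_layout()
    return fig


# Stand-in rows for illustration only, not the real training data.
train = pd.DataFrame(
    {
        "pet_name": ["nico", "nico", "poncie", "teddy", "titus"],
        "distance": [0.1, 0.5, 0.2, 0.8, 0.4],
        "lines_per_sec": [0.3, 0.6, 0.4, 0.2, 0.5],
    }
)
fig = plot_by_pet(train)
```

Stacking the panels vertically with a shared x-axis is just one layout choice; a single row of panels or one color per pet on a shared axis would serve the same purpose.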

What an incredible coincidence that the data spells out Happy April Fools' Day! Wow, pets ARE amazing. You should probably go give your cat a treat.

Now that we've cracked the code, we can see that the test set must be the missing points in our April Fools' Day message. There are different ways to back out the linear equations for the missing pieces, but the simplest is likely trial and error: guess the slope and intercept of each line, then check the result visually. Discerning participants may recognize that all parameters contain the digits 41.

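import numpy as np

# nico and titus share one line, poncie's line slopes downward,
# and any remaining pet (teddy) gets the final line.
test["lines_per_sec"] = np.where(
    test["pet_name"].isin(["nico", "titus"]),
    0.41 * test["distance"] + 0.41,
    np.where(
        test["pet_name"] == "poncie",
        -0.041 * test["distance"] + 0.41,
        0.041 * test["distance"] + 0.41,
    ),
)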
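The same rule can also be written as a lookup table of per-pet slopes and intercepts, which may be easier to tweak while guessing parameters. This is a sketch rather than the benchmark code: the `PARAMS` dictionary and `predict` function are names introduced here, the small `test` DataFrame is made up, and the table assumes the only pets are the four named in this post (so the final case of the snippet above corresponds to teddy).

```python
import pandas as pd

# (slope, intercept) per pet, copied from the np.where snippet above.
# Hypothetical lookup table; assumes exactly these four pet names occur.
PARAMS = {
    "nico": (0.41, 0.41),
    "titus": (0.41, 0.41),
    "poncie": (-0.041, 0.41),
    "teddy": (0.041, 0.41),
}


def predict(df):
    """Return slope * distance + intercept, using each row's pet-specific line."""
    slope = df["pet_name"].map(lambda pet: PARAMS[pet][0])
    intercept = df["pet_name"].map(lambda pet: PARAMS[pet][1])
    return slope * df["distance"] + intercept


# Made-up rows standing in for the real test set.
test = pd.DataFrame(
    {
        "pet_name": ["nico", "titus", "poncie", "teddy"],
        "distance": [1.0, 2.0, 3.0, 4.0],
    }
)
test["lines_per_sec"] = predict(test)
```

Unlike the nested `np.where`, this version raises a `KeyError` on an unexpected pet name instead of silently falling through to the last case, which may or may not be what you want.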

Now let's plot our predicted data.

Congrats to gpilgrim, sioHbzExH10iU, agacich, rogervrodrigues, and AmirH for solving the puzzle, approximating the lines of missing data, and beating the benchmark solution! Nico, Poncie, Teddy, and Titus look forward to seeing your work in future competitions, and will all now be returning to their naps.

We hope this challenge was a helpful reminder that plotting and visualization can uncover important structural patterns in your data that descriptive statistics alone don't fully capture. The dataset was created using drawdata.xyz, which allows flexible manual creation of a dataset based on a graph.
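As a small illustration of that point, the sketch below builds two made-up datasets whose per-column summaries match while the relationship between the columns is very different. The data and variable names are invented for this example and have nothing to do with the competition dataset.

```python
import numpy as np
import pandas as pd

x = np.arange(10, dtype=float)

# Dataset 1: y rises in lockstep with x.
line = pd.DataFrame({"x": x, "y": x})

# Dataset 2: exactly the same y values, but reordered so they zigzag against x.
zigzag = pd.DataFrame({"x": x, "y": x[[0, 9, 1, 8, 2, 7, 3, 6, 4, 5]]})

# Per-column summaries (count, mean, std, quartiles, min, max) are the same...
same_summary = np.allclose(line.describe().to_numpy(), zigzag.describe().to_numpy())

# ...yet the relationship between x and y differs sharply.
corr_line = line["x"].corr(line["y"])
corr_zigzag = zigzag["x"].corr(zigzag["y"])

print(same_summary, corr_line, corr_zigzag)
```

A scatter plot of each dataset makes the difference obvious at a glance, which is the same reason the per-pet plots gave the game away here.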

Titus implementing a cutting-edge convolutional neural network with a ResNet backbone.


Thanks to all the paw-ticipants! We hope you had fun celebrating April Fools' Day with us—we know we did.

