Sample Projects

Automating wildlife identification for research and conservation

Partners: Max Planck Institute for Evolutionary Anthropology, Arcus Foundation

Detected wildlife in video footage—automatically and at scale—by running a global algorithm development challenge and building an open source application with the winning solution (Project Zamba).

Approaches include: Deep learning, computer vision, transfer learning, data science competition, crowdsourced data annotations, open source software


Analyzing satellite images to serve smallholder farmers

Partners: FarmDrive, The Impact Lab, The World Bank

Inferred information about what farmers are growing using daily satellite images, at a fraction of the ongoing cost of collecting this information in person, to promote financial inclusion through services like input loans or crop insurance.

Approaches include: Deep learning, computer vision, transfer learning, public playbook


Illuminating mobile money experiences in Tanzania

Partners: IDEO.org

Analyzed millions of mobile money records to uncover patterns in behavior, and then combined these insights with human-centered design to shape new approaches to delivering mobile money to low-income populations in Tanzania.

Approaches include: Human-centered design + data science, exploratory analysis, interactive visualization, rapid prototyping


Tracking attacks on health care in Ukraine

Partners: Insecurity Insight, Physicians for Human Rights

Built a real-time, interactive map to visualize attacks on the Ukrainian health care system since the Russian invasion began in February 2022. The map supports partner efforts to provide aid, hold aggressors accountable in court, and increase public awareness.

Approaches include: Interactive visualization, open data, geospatial data, production web application


Mapping fair trade products from source to shelf

Partners: Fair Trade USA

Visualized the flow of fair trade coffee products from the farms where they are grown to the stores where they are sold, connecting the nodes in supply chain transactions and increasing transparency for customers and auditors.

Approaches include: Interactive dashboarding, GIS analysis, Tableau


Matching students with schools where they are likely to succeed

Partners: Data science company foundation

Used machine learning to match students with higher education programs where they are more likely to get in and graduate based on their unique profile. Focused on serving students from backgrounds traditionally less likely to attend college or apply for more competitive programs.

Approaches include: Recommender systems, predictive modeling, software engineering


Developing performance indicators and repayment models in off-grid solar

Partners: The World Bank, Angaza, GOGLA, Lighting Global

Analyzed repayment behaviors across dozens of pay-as-you-go (PAYG) solar energy companies serving off-grid populations throughout Africa; developed key performance indicators (KPIs) to facilitate standardized measurement and reporting for PAYG portfolios.

Approaches include: Predictive modeling, exploratory analytics, open source software, key performance indicators (KPIs), public-private partnerships


Modeling patient pathways through hospitals

Partners: Haystack Informatics

Mapped probabilistic patient journeys through hospitals based on tens of thousands of patient experiences, giving hospitals a clearer view of the timing of activities in their departments and how that timing relates to operational efficiency.

Approaches include: Predictive modeling, activity-based costing, Spark, production web application
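
As an illustration only, one simple way to represent probabilistic journeys is a first-order transition table: for each activity, the share of observed journeys that moved on to each next activity. The sketch below assumes journeys arrive as ordered lists of activity names; the function name, the activity labels, and the toy data are all hypothetical, and the sketch ignores timing entirely. It is not a description of the model built for this project.

```python
from collections import Counter, defaultdict


def transition_probabilities(journeys):
    """Estimate P(next activity | current activity) from ordered journeys."""
    counts = defaultdict(Counter)
    for journey in journeys:
        # Pair each activity with the one that immediately follows it.
        for current, following in zip(journey, journey[1:]):
            counts[current][following] += 1
    # Normalize each row of counts into probabilities.
    return {
        current: {
            following: n / sum(followers.values())
            for following, n in followers.items()
        }
        for current, followers in counts.items()
    }


# Toy, made-up journeys (hypothetical activity names).
journeys = [
    ["triage", "imaging", "discharge"],
    ["triage", "lab", "imaging", "discharge"],
    ["triage", "imaging", "admission"],
    ["triage", "lab", "discharge"],
]
probs = transition_probabilities(journeys)
```

In this toy data, half of the journeys go from triage to imaging and half go to the lab. A realistic model would also need to account for how long each activity takes, which this sketch leaves out.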


Surfacing signals from chat messages with plant doctors

Partners: CABI Plantwise

Automated recognition of agricultural entities (such as crops, pests, diseases, and chemicals) in WhatsApp and Telegram messages among plant doctors, enabling new ways to surface emerging trends and improve science-based guidance for smallholder farmers.

Approaches include: Natural language processing (NLP), named-entity recognition (NER), fuzzy matching, human-in-the-loop data annotation
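
To give a feel for the fuzzy matching piece, here is a minimal sketch that looks up words and two-word phrases from a message in a small dictionary of known entities while tolerating misspellings. It uses `difflib.get_close_matches` from the Python standard library. The dictionary contents, the `tag_entities` function, and the 0.8 similarity cutoff are assumptions made for illustration, not the project's actual pipeline, which also involved trained NER models and human annotation.

```python
from difflib import get_close_matches

# Hypothetical gazetteer: canonical entity name -> entity type.
GAZETTEER = {
    "maize": "crop",
    "cassava": "crop",
    "fall armyworm": "pest",
    "aphid": "pest",
    "late blight": "disease",
    "glyphosate": "chemical",
}


def tag_entities(message, cutoff=0.8):
    """Return (matched_text, canonical_name, entity_type) for fuzzy hits."""
    # Lowercase, split on whitespace, and trim surrounding punctuation.
    tokens = [t.strip(".,;:!?()") for t in message.lower().split()]
    tokens = [t for t in tokens if t]
    max_len = max(len(name.split()) for name in GAZETTEER)
    found = []
    i = 0
    while i < len(tokens):
        # Try the longest phrase first so "fall armyworm" wins over "fall".
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            hits = get_close_matches(candidate, list(GAZETTEER), n=1, cutoff=cutoff)
            if hits:
                found.append((candidate, hits[0], GAZETTEER[hits[0]]))
                i += n
                break
        else:
            # No phrase starting at this token matched; move on.
            i += 1
    return found


# Made-up message with two misspelled entity names.
result = tag_entities("Farmer reports fall armyworn on maize, sprayed glyphosat.")
```

A higher cutoff reduces false matches at the cost of missing heavier misspellings. In practice, uncertain matches are the natural candidates to route to a human reviewer.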


Predicting public health risks from restaurant reviews

Partners: Yelp, Harvard University, City of Boston

Flagged public health risks at restaurants by combining Yelp reviews with open city data on past inspections. The algorithmic approach uncovers 25% more violations with the same number of inspections.

Approaches include: Machine learning challenge, natural language processing (NLP), open data, alternative data sources


Smart auto-tagging of K-12 school spending

Partners: Education Resource Strategies

Built algorithms that put apples-to-apples labels on school budget line items so that districts understand how their spending stacks up and where they can improve, saving months of manual processing each year.

Approaches include: Natural language processing (NLP), machine learning challenge, Excel tooling, ranked prioritization for manual follow-up


Building data tools to fight human trafficking in Nepal

Partners: Love Justice

Aided anti-trafficking efforts at border crossings and airports by combining data across locations and surfacing insights that give interviewers greater intelligence about the right questions to ask and how to direct them.

Approaches include: Data entry user experience design, data repository, GIS analysis, dynamic dashboard


Putting AI into the hands of lung cancer clinicians

Partners: GO2 Foundation for Lung Cancer

Translated advances in machine learning research to practical software for clinical settings, building an open source application through a new kind of data challenge.

Approaches include: Data challenge, deep learning, open source software, computer vision, predictive modeling, computer-aided diagnosis


Driving data education through custom competitions

Partners: Microsoft

Developed online, white-label data science competitions in which students synthesize what they have learned and test their skills on applied challenges. Each capstone features a real-world dataset focused on an important issue in the social sector.

Approaches include: Private data challenge, regression analysis, predictive modeling, data science education