Automating wildlife monitoring with Zamba & Zamba Cloud

DrivenData partnered with conservation researchers to create Zamba, an open-source machine learning solution that helps wildlife researchers process camera trap footage, reducing months of manual review to hours of automated analysis.

The organizations

The Pan African Programme at the Max Planck Institute for Evolutionary Anthropology aims to understand the evolutionary and ecological drivers of chimpanzee cultural and behavioral diversity. The programme collects systematic data on chimpanzee populations from over 40 research and conservation sites across Africa.

WILDLABS provides a platform for bringing together the world's conservation technology community to accelerate the development and deployment of conservation technologies.

The challenge

Camera traps have become essential tools for wildlife conservation, generating massive datasets that enable researchers to study animal behavior, track population changes, and assess conservation impacts. However, these motion-triggered cameras create an overwhelming volume of data: a single deployment can produce tens of thousands of videos and images, with up to 70% containing no animals at all. Manual review of this footage is a critical bottleneck that consumes valuable researcher time and leaves much camera trap data underused.

The conservation community needed an accurate, accessible solution to identify animals in camera trap data that could work across diverse habitats and species while being usable by researchers without programming expertise or specialized computing hardware.

The approach

DrivenData collaborated with conservation researchers through a series of machine learning competitions to develop automated wildlife identification models. In the Pri-matrix Factorization Challenge, participants trained machine learning models to classify 23 species from nearly 2,000 hours of annotated camera trap footage. The follow-up Deep Chimpact Challenge focused on monocular depth estimation—inferring the distance between the animal and the camera—which is an intermediate step for estimating population abundance using distance sampling methods.

Building on the winning algorithms, DrivenData created Zamba, an open-source Python package that uses machine learning and computer vision to automatically detect and classify animals in camera trap images and videos. Video models were trained on 250,000 labeled videos from 14 countries across Central and West Africa, and image models were developed using over 15 million annotations from 7 million images across 20 global datasets.
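To illustrate how classifier output can shorten manual review, here is a minimal sketch of triaging predictions. It does not call Zamba itself. It assumes a classifier reports one probability per label for each video, including a "blank" label for footage with no animals. The file names, labels, scores, the triage function, and the 0.6 confidence threshold are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical per-video class probabilities, shaped like the output of a
# species classifier. File names, labels, and scores are illustrative only.
PREDICTIONS: Dict[str, Dict[str, float]] = {
    "site_a/clip_001.mp4": {"blank": 0.92, "chimpanzee": 0.05, "elephant": 0.03},
    "site_a/clip_002.mp4": {"blank": 0.10, "chimpanzee": 0.85, "elephant": 0.05},
    "site_b/clip_003.mp4": {"blank": 0.48, "chimpanzee": 0.12, "elephant": 0.40},
}


def top_label(scores: Dict[str, float]) -> Tuple[str, float]:
    """Return the label with the highest probability and that probability."""
    return max(scores.items(), key=lambda item: item[1])


def triage(
    predictions: Dict[str, Dict[str, float]],
    blank_label: str = "blank",
    min_confidence: float = 0.6,  # assumed threshold, tune per project
) -> Dict[str, List[str]]:
    """Sort videos into three buckets: blank, labeled, and needs review."""
    buckets: Dict[str, List[str]] = {"blank": [], "labeled": [], "review": []}
    for path, scores in predictions.items():
        label, probability = top_label(scores)
        if probability < min_confidence:
            # The model is unsure, so a person should look at this video.
            buckets["review"].append(path)
        elif label == blank_label:
            # Confidently empty footage can be skipped during manual review.
            buckets["blank"].append(path)
        else:
            buckets["labeled"].append(path)
    return buckets


if __name__ == "__main__":
    for bucket, paths in triage(PREDICTIONS).items():
        print(bucket, paths)
```

With a split like this, reviewers would only need to watch the videos in the review bucket and spot-check the rest. Where to set the threshold is a trade-off between reviewer time and the risk of missing animals.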

To make these capabilities accessible to non-programmers, the team developed Zamba Cloud, a no-code web platform that runs machine learning workloads on managed cloud infrastructure.

[Screenshot of Zamba Cloud: a camera trap image on the left, with its classification as a puma on the right.]

The results

Zamba stands out for its ability to process camera trap video data. Unlike most existing tools, which support only image data, Zamba enables automatic species classification for both still images and videos, as well as depth estimation from videos. Zamba also supports custom model training through fine-tuning, so researchers can adapt the base models to their specific species and habitats rather than relying solely on pretrained models. This makes Zamba useful across the wide variety of environments in which camera traps are deployed.

Moreover, Zamba Cloud's no-code custom model training capability democratizes access to sophisticated AI tools. Over 300 users from around the world have used Zamba Cloud to process more than 1.1 million videos.

By enabling conservationists to efficiently analyze large datasets and train models tailored to their specific ecological contexts, Zamba advances wildlife monitoring, research, and evidence-based conservation.

Our real-world impact

Partners: Max Planck Institute for Evolutionary Anthropology, Arcus Foundation, WILDLABS

Automating wildlife identification for research and conservation

Detected wildlife in images and videos, automatically and at scale, by building the winning algorithm from a DrivenData competition into an open-source Python package and a web application that runs models in the cloud.

Partners: CodePath

Data engineering from the ground up

Built data infrastructure to ingest, clean, integrate, and organize data across CodePath, created interactive dashboards for accurate monitoring of program trends, and provided trusted data expertise to identify and hire talent to carry the work forward.

Partners: The National Center for State Courts

Building a private LLM sandbox for NCSC

We worked with the National Center for State Courts to build an LLM chat sandbox for private usage. This sandbox allows users to experiment with LLM tools in a way that is safe, secure, and cost-effective, with specific use cases and prompts relevant to their work.

Partners: The World Bank, The Conflict and Environment Observatory

Identifying crop types using satellite imagery in Yemen

Used satellite imagery to identify crop extent, crop types and climate risks to agriculture in Yemen, informing World Bank development programs in the country after years of civil war.

Partners: Private sector, social sector

Building applied solutions with LLMs

Built solutions using LLMs for multiple real-world applications, across tasks including semantic search, summarization, named entity recognition, and multimodal analysis. Work has ranged from research on state-of-the-art models tuned for specific use cases to production-ready retrieval-augmented AI applications.

Partners: Bureau of Ocean Energy Management, NOAA Fisheries, Wild Me

Protecting endangered beluga whales with computer vision

Designed and administered a computer vision challenge that produced state-of-the-art machine learning models to identify and match individual endangered beluga whales from photo surveys.

Partners: EverFree

A production application to support survivors of human trafficking

Built the Freedom Lifemap platform, a digital tool designed to support survivors of human trafficking on their journey toward reintegration and independence.

Partners: ReadNet

Crowdsourcing solutions for AI-assisted early literacy screening

Ran a machine learning challenge to develop automatic scoring methods for audio clips from literacy screener exercises. Automated scoring can help teachers quickly and reliably identify children in need of early literacy intervention.

Partners: Science for America

Making higher education data more accessible

Created an open-source Python library and interactive data visualization platform for analyzing U.S. higher education data and illuminating trends and disparities in STEM education.

Partners: IDEO.org

Illuminating mobile money experiences in Tanzania

Analyzed millions of mobile money records to uncover patterns in behavior, and then combined these insights with human-centered design to shape new approaches to delivering mobile money to low-income populations in Tanzania.

Partners: Insecurity Insight, Physicians for Human Rights

Tracking attacks on health care in Ukraine

Built a real-time, interactive map to visualize attacks on the Ukrainian health care system since the Russian invasion began in February of 2022. The map will support partner efforts to provide aid, hold aggressors accountable in court, and increase public awareness.

Partners: Wellcome

Addressing algorithmic bias in medical research

Conducted a literature review to understand the current state of bias identification and mitigation in mental health research, and synthesized recommended best practices from the field of machine learning.

Partners: CABI Plantwise

Mining chat messages with plant doctors using language models

Automated recognition of agricultural entities (such as crops, pests, diseases, and chemicals) in WhatsApp and Telegram messages among plant doctors, enabling new ways to surface emerging trends and improve science-based guidance for smallholder farmers.

Partners: NASA

Monitoring water quality from satellite imagery

Created an open-source package to detect harmful algal blooms using machine learning and satellite imagery. Included running a machine-learning competition, conducting end user interviews, and engineering a robust, deployable pipeline.

Partners: Data science company foundation

Matching students with schools where they are likely to succeed

Used machine learning to match students with higher education programs where they are more likely to get in and graduate based on their unique profile, with a focus on backgrounds traditionally less likely to attend college or apply to more competitive programs.

Partners: University of Maryland

Processing multimodal tutoring data

Built well-engineered data pipelines to extract machine learning features from audio, video and transcript data collected from online tutoring sessions, enabling a team at the University of Maryland to study how relationship-building affects student outcomes.

Partners: Fair Trade USA

Mapping fair trade products from source to shelf

Visualized the flow of fair trade coffee products from the farms where they are grown to the stores where they are sold, connecting the nodes in supply chain transactions and increasing transparency for customers and auditors.

Partners: The World Bank, Angaza, GOGLA, Lighting Global

Developing performance indicators and repayment models in off-grid solar

Analyzed repayment behaviors across dozens of pay-as-you-go (PAYG) solar energy companies serving off-grid populations throughout Africa, and developed KPIs to facilitate standardized reporting for PAYG portfolios.

Partners: Haystack Informatics

Modeling patient pathways through hospitals

Mapped out the probabilistic patient journeys through hospitals based on tens of thousands of patient experiences, giving hospitals a better view into the timing of the activities in their departments and how they relate to operational efficiency.

Partners: Education Resource Strategies

Smart auto-tagging of K-12 school spending

Built algorithms that put apples-to-apples labels on school budget line items so that districts understand how their spending stacks up and where they can improve, saving months of manual processing each year.
