The organization
The World Bank works to reduce poverty and support development around the world, with a particular focus on fragile and conflict-affected situations. In Yemen, the organization supports programs for food and nutrition security in one of the world's most challenging humanitarian contexts, where traditional data collection methods are often impossible due to ongoing conflict.
The challenge
Yemen faces a severe humanitarian crisis, with food security significantly impacted by conflict and climate change. Understanding the evolution and current state of agriculture across the entire country, which spans over 500,000 square kilometers, is critical for informing World Bank and other donor support programs for food and nutrition security. However, agricultural data in Yemen is not readily available and is extremely difficult to collect through traditional on-the-ground surveys. Alternative approaches based on remote sensing data were needed to provide timely, accurate information for food security planning and response across all 22 governorates.
The approach
DrivenData collaborated with The World Bank to develop a machine learning approach that used publicly available remote sensing data to track agricultural trends across all of mainland Yemen over a six-year period (2019-2024). Our methodology scaled from detailed analysis of individual sites to country-wide inference, processing millions of satellite pixels monthly.
The project began with expert remote labeling of over 210 square kilometers of high-resolution satellite imagery across eight strategically selected sites representing Yemen's diverse agro-ecological zones. Using this training data, we developed multi-temporal machine learning models that could process Sentinel-2 satellite data at 10-meter resolution across the entire country.
Our hierarchical crop classification system distinguished among agricultural categories including cereals, qat, coffee, orchards, fodder, greenhouses, and mixed crops, generating country-wide predictions for each year. The models were multi-pixel and took 12 months of satellite data as input, including differences in NDVI (normalized difference vegetation index). They combined elevation data with five spectral bands, and relied on Google Earth Engine's pixel quality algorithms to filter out clouds reliably at country scale.
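To make the NDVI-difference inputs concrete, here is a minimal Python sketch. NDVI is the standard index (NIR - Red) / (NIR + Red). Taking differences between consecutive months is only one plausible reading of "NDVI differences" and may not match the project's exact feature definition. The function names and the reflectance values are hypothetical, chosen purely for illustration.

```python
from typing import List, Sequence


def ndvi(nir: float, red: float) -> float:
    """Normalized difference vegetation index for one pixel observation."""
    total = nir + red
    # Guard against division by zero for fully dark (e.g. masked) pixels.
    return 0.0 if total == 0 else (nir - red) / total


def monthly_ndvi_differences(nir: Sequence[float], red: Sequence[float]) -> List[float]:
    """Return the change in NDVI between each pair of consecutive months.

    ASSUMPTION: differences are taken between consecutive months; the
    write-up only states that the models used "NDVI differences".
    """
    if len(nir) != len(red):
        raise ValueError("nir and red must cover the same number of months")
    series = [ndvi(n, r) for n, r in zip(nir, red)]
    return [later - earlier for earlier, later in zip(series, series[1:])]


# Hypothetical 12-month reflectance series for a single pixel (not real data).
nir_series = [0.30, 0.32, 0.40, 0.50, 0.55, 0.50, 0.40, 0.32, 0.30, 0.30, 0.30, 0.30]
red_series = [0.20, 0.20, 0.15, 0.10, 0.08, 0.10, 0.15, 0.20, 0.20, 0.20, 0.20, 0.20]

diffs = monthly_ndvi_differences(nir_series, red_series)
print(len(diffs))  # 11 differences for 12 monthly observations
```

A rising then falling sequence of differences like this one is the kind of seasonal signal that can separate actively cultivated land from bare or permanently vegetated land, which is presumably why a full year of observations is used per prediction.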
The inference challenge was substantial: generating annual predictions for active cropland and crop types across millions of 10-meter pixels covering all agricultural areas of Yemen, then aggregating these pixel-level predictions to meaningful administrative units for policy analysis. Significant effort went into scaling inference so that the required 27 billion predictions could be performed in a reasonable amount of time.
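The aggregation step can be sketched in a few lines of Python. A 10-meter pixel covers 100 square meters, or 0.01 hectares, so a district's active cropland area is its count of active pixels multiplied by 0.01. The input layout (one `(district, is_active)` pair per pixel), the district labels, and the function name are illustrative assumptions, not the project's actual pipeline.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

PIXEL_SIZE_M = 10  # Sentinel-2 resolution stated in the write-up
HECTARES_PER_PIXEL = (PIXEL_SIZE_M ** 2) / 10_000  # 100 m^2 = 0.01 ha


def active_hectares_by_district(pixels: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Sum active-cropland area per district from pixel-level predictions.

    ASSUMPTION: each pixel arrives already tagged with its district and a
    boolean "active cropland" prediction.
    """
    counts: Counter = Counter()
    for district, is_active in pixels:
        # Adding 0 still registers the district, so districts with no
        # active pixels are reported as 0.0 ha instead of being dropped.
        counts[district] += int(is_active)
    return {district: n * HECTARES_PER_PIXEL for district, n in counts.items()}


# Hypothetical predictions for four pixels in two made-up districts.
sample = [
    ("district_a", True),
    ("district_a", True),
    ("district_a", False),
    ("district_b", False),
]
areas = active_hectares_by_district(sample)
print(areas)
```

Because the function consumes an iterable one pixel at a time, the same logic works on a stream of predictions too large to hold in memory. At the scale described here, a vectorized or distributed equivalent would be the practical choice.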
The results
The country-wide analysis revealed that actively cultivated cropland in Yemen increased substantially between 2019 and 2024, with peak activity reaching approximately twice the total area cultivated in 2019, or over 1.3 million hectares of active agriculture by 2024. The team generated predictions across all 333 districts in Yemen and found that agricultural activity varied significantly from year to year, with changes most strongly correlated with rainfall patterns.
The scale of insights was unprecedented for Yemen. Detailed crop type maps showed that qat comprises a large share of agricultural land, that cereals dominate the agricultural landscape with over 600,000 hectares in peak years, and that agricultural intensity and crop selection follow clear regional patterns. While conflict impacts were visible in specific locations like Al Jawf governorate, the aggregate analysis across all districts showed weaker associations between agriculture and factors like conflict severity, food security, population density, and wealth at the governorate level.

Animation of active agriculture predictions over time in the Al Jawf governorate, showing an apparent increase in actively cultivated land area as the front line of conflict moves southeast.
These findings provide crucial evidence for policy makers and humanitarian organizations working on food security in Yemen. The maps and trend analyses created through this project lay the foundation for more targeted interventions and continued monitoring of agricultural resilience in conflict-affected areas. The methodology developed can be applied to other fragile contexts where traditional agricultural data collection is challenging or impossible, potentially covering millions of square kilometers across conflict-affected regions globally.