The organization
Science for America is a solutions incubator that brings together scientists, technologists, and cross-sector partners to develop game-changing initiatives addressing critical societal challenges. By fostering collaborations and supporting data-driven tools, it aims to drive systemic improvements in climate & energy, health & medicine, and STEM equity & education.
The challenge
The Integrated Postsecondary Education Data System (IPEDS) offers comprehensive data on U.S. higher education institutions, including degree completions by field, race/ethnicity, gender, and institution. However, the data and the codes used to classify educational fields are spread across numerous files, and both the file formats and the codes have changed over time, making longitudinal or comparative analyses difficult.
While working to understand representation in STEM education, Science for America surfaced the need for a more accessible and structured way to work with IPEDS data. A streamlined solution would allow educators, advocates, and researchers to better understand who is earning STEM degrees, where, and how that has changed over time.
The approach
To meet this need, Science for America partnered with DrivenData to create scipeds, an open source Python library that simplifies working with IPEDS data.
scipeds provides:
- A reproducible data processing pipeline that standardizes and organizes decades of IPEDS survey data into a centralized DuckDB database.
- Python query engines that enable users to run common analyses without writing any SQL.
- Support for core IPEDS components from 1984 to 2023, including completions by field, degree type, race/ethnicity, and gender.
Because the schema stays stable across years, users can query the data quickly and produce reliable, reproducible analyses.
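As a rough illustration of the idea (not the scipeds API itself), the sketch below maps year-specific field codes onto one stable field name, loads the result into a single table, and wraps a common query in a Python method so the caller writes no SQL. Everything in it is an assumption made for the example: the codes, the `completions` table and its columns, and the `ToyQueryEngine` class are hypothetical, the counts are made up, and Python's built-in `sqlite3` stands in for DuckDB so the snippet runs without extra installs.

```python
import sqlite3

# Hypothetical raw records: (year, raw_field_code, gender, n_awards).
# The codes differ between years, mimicking a classification that changed.
RAW_ROWS = [
    (2000, "OLD-CS", "women", 30),
    (2000, "OLD-CS", "men", 70),
    (2020, "NEW-CS", "women", 60),
    (2020, "NEW-CS", "men", 90),
    (2020, "NEW-BIO", "women", 120),
    (2020, "NEW-BIO", "men", 80),
]

# Hypothetical crosswalk from year-specific codes to one stable field name.
CROSSWALK = {
    "OLD-CS": "Computer Science",
    "NEW-CS": "Computer Science",
    "NEW-BIO": "Biology",
}


def build_database(raw_rows):
    """Standardize raw rows and load them into one table with a fixed schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE completions "
        "(year INTEGER, field TEXT, gender TEXT, n_awards INTEGER)"
    )
    conn.executemany(
        "INSERT INTO completions VALUES (?, ?, ?, ?)",
        [(year, CROSSWALK[code], gender, n) for year, code, gender, n in raw_rows],
    )
    return conn


class ToyQueryEngine:
    """Hides the SQL for one common analysis behind a Python method."""

    def __init__(self, conn):
        self.conn = conn

    def share_by_year(self, field, gender):
        """Return {year: fraction of awards in `field` earned by `gender`}."""
        sql = """
            SELECT year,
                   SUM(CASE WHEN gender = ? THEN n_awards ELSE 0 END) * 1.0
                       / SUM(n_awards) AS share
            FROM completions
            WHERE field = ?
            GROUP BY year
            ORDER BY year
        """
        rows = self.conn.execute(sql, (gender, field))
        return {year: share for year, share in rows}


engine = ToyQueryEngine(build_database(RAW_ROWS))
shares = engine.share_by_year("Computer Science", "women")
for year, share in shares.items():
    print(year, round(share, 2))
```

Because the crosswalk is applied once at load time, the query method never needs to know which code a given year used. That separation is what lets the same analysis run unchanged across decades of data.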
The results
To make IPEDS data accessible to people without programming or data analysis experience, the team built a companion interactive visualization platform that uses the scipeds library to power real-time data analysis.

The Fields page of the scipeds interactive data visualization platform.
The interactive platform contains pages with visualizations designed for specific audiences that highlight IPEDS data in different ways:
- Students and families can explore how representation varies by race, gender, and field of study to make informed decisions about where to apply.
- Faculty and administrators can benchmark their institutions against national trends and peer schools to understand disparities and opportunities for progress.
- Researchers and advocates can dive deep into custom queries using the data sandbox for exploratory analysis and hypothesis testing.
Together, these tools help a range of users understand the current state of STEM higher education in the U.S. and make better-informed decisions.
To learn more, visit the documentation, try out the library, or explore the interactive site.