
The organization
CodePath is a national nonprofit dedicated to transforming computer science education for first-generation and low-income students. By offering no-cost technical courses, career support, and a robust community network, CodePath equips college students with the skills and experience necessary to launch thriving careers in tech. The organization partners with top universities and major tech employers to bridge the gap between traditional education and industry expectations.
The challenge
As CodePath expanded its reach, its evolving data needs prompted a closer look at how to better support growth by engineering more robust data infrastructure. It’s a scenario that is familiar to ambitious and rapidly growing organizations: as operational data grows across a range of platforms, managing and unifying this information becomes increasingly complex. Definitions and reporting practices grow organically across teams, leading to duplicated effort and inconsistent results.
In light of this, CodePath recognized the benefit of building a more centralized and robust data foundation, but it faced a common challenge: how do you bootstrap data infrastructure, and how do you identify and hire experienced data professionals, when you have no prior in-house data expertise?
The approach
To bridge that gap, DrivenData partnered with CodePath to design and implement modern, organization-wide data infrastructure centered on clarity, consistency, and sustainability, while also launching a hiring process to assemble a team to sustain and advance the work. DrivenData worked closely with CodePath to carry out the following approach:
- Collaboration with data stakeholders: Designing data infrastructure may sound like a purely technical challenge, but its success hinges on integrating the extensive domain knowledge of staff across an organization. DrivenData collaborated closely with stakeholders at all levels to grasp how data is generated and utilized in real-world scenarios. This collaboration ensured that data systems align with the organization's operational realities.
- Organizing the data: We employed software engineering best practices and industry-standard tools to ingest, clean, organize, and document data so that everyone works from the same definitions and calculations, building trust in the data and its derivatives.
- Empowering staff with tools to deliver business value from the data: Data infrastructure is only as useful as the tools it supports. We created a suite of interactive dashboards and reports, allowing staff across departments to get the insights they need on demand and without needing to write code.
- Building internal capacity: To support long-term success, CodePath launched a hiring initiative for a full-time data engineer and a data scientist. DrivenData served as trusted data experts throughout the hiring process, deploying a technical take-home project and evaluating the candidates’ qualifications and suitability for their roles in the organization.
The results
CodePath now has robust data infrastructure that supports reliable decision-making across the organization. The key components include:
- A scalable cloud data warehouse and transformation pipeline built on industry-standard technologies and modern architectural best practices, resulting in a centralized data system (a “single source of truth”) that brings together information from across CodePath’s systems
- Interactive dashboards and automated reports enabling staff to monitor program performance, trends, and student outcomes with up-to-date data
- Training and documentation consisting of a library of resources such as dashboard user guides, video walkthroughs, and detailed business logic definitions, in addition to synchronous training sessions and office hours as new data products are introduced
- An in-house data team including a dedicated data engineer and data scientist, empowering CodePath to independently carry forward the data infrastructure work and pursue more advanced analytics and machine learning initiatives in the future