
(Tech) Infrastructure Week for the Nonprofit Sector

Reflections on how to build data and AI infrastructure in the social sector that serves the needs of nonprofits and their beneficiaries.

Peter Bull
Co-founder

This Good Tech Fest (GTF) didn’t start for me with “Data science is bullsh*t, right?” like last year, but the 2025 edition was just as provocative. After the conference, I’ve been working to crystallize some of the conversations into trends in the sector. The conference was off-the-record, so please forgive me for not citing people directly.

I want to start with a reflection that GTF is a gathering of doers. These are the people actually building towards a better future with technology. Andrew called the group “impatient optimists,” and I think that spirit lived on throughout the conference. A note to other event organizers: focusing on facilitated discussions and an agenda with lots of open time creates space to build connections and momentum. This is the one event that consistently makes me feel like substantive ideas, directions, and partnerships came out of it.

With that frame, here are the concrete trends across the sector that emerged for me.

Nonprofit mycelia

A great talk on “Pando Funds” on the first day inspired discussions on how work gets funded. In addition to top-down systems where funders dictate priorities, we need to have structures which empower organizations working on an issue to coordinate and put funding and resources where the needs are. These priorities should be emergent rather than imposed, and they should benefit the cause, not an organization.

A key part of the presentation was looking at the history of organizational charts for complex organizations, including the “first” organizational chart! Along with that came a discussion of system (or ecosystem) diagrams that map out how change happens across multiple organizations. To me, there did seem to be some enthusiasm from funders in the room for different models like this.

One interesting observation is that these visualizations are designed to make people see the structure of the system. However, they are not designed as references or interactive tools. I can’t easily see which organizations in a system diagram are working on the same issue as me. Making these kinds of diagrams does not go far enough. The diagrams need to become data that can be interacted with, and this generation of AI tools now makes it feasible to do that with much less effort.
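To make "diagrams as data" concrete, here is a minimal sketch, assuming the diagram has already been transcribed into (organization, issue) pairs. The organization names, issue areas, and function names are all hypothetical; the point is only that once the diagram is data, the question "who is working on the same issue as me?" becomes a short query.

```python
from collections import defaultdict

# Hypothetical edges read off a system diagram: (organization, issue area).
EDGES = [
    ("Org A", "housing"),
    ("Org B", "housing"),
    ("Org B", "food security"),
    ("Org C", "food security"),
    ("Org D", "literacy"),
]


def build_index(edges):
    """Map each issue area to the set of organizations working on it."""
    by_issue = defaultdict(set)
    for org, issue in edges:
        by_issue[issue].add(org)
    return by_issue


def peers(org, edges):
    """For every issue `org` works on, list the other organizations on that issue."""
    by_issue = build_index(edges)
    return {
        issue: sorted(orgs - {org})
        for issue, orgs in by_issue.items()
        if org in orgs
    }


if __name__ == "__main__":
    print(peers("Org B", EDGES))
```

An empty list for an issue is informative too: it suggests that, as far as the diagram knows, nobody else is working on it.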

This theme ties to discussions about the differences between organizations that have an earned-revenue model and those that don’t. I’ve long been a believer that earned revenue is critical for organizations that aren’t at the front line. Organizations that provide services, platforms, and infrastructure that other nonprofits build on often struggle to win grant funding, which is earmarked for impact. We need a robust ecosystem of sector providers. These providers should earn money from nonprofits that need their services, and funders should support these expenses as part of the grants they give.

Critical improvements for sector-level data

My ear was particularly attuned to discussion of sector-level data. It is exactly the kind of ecosystem-enabling tooling I was just mentioning. And when we talk about sector-level data, the elephant in the room is 990s, the tax forms that US nonprofits file with the IRS. However, the limitations of 990s have become even more apparent in recent months. First, they are immediately out of date if you’re interested in organizations responding fast to crises (say, for example, in the federal government). They simply aren’t a timely source of data. Second, they are US only. This leaves out a large chunk of the sector. Third, we can’t see the things we really need to see: who works together, who doesn’t, and why (applied but not funded, missions not overlapping, etc.). Fourth, we’ve basically commoditized processing 990s at this point. We need to get out of thinking of tax forms as our sector-level data.

So, what else besides 990s should we think about? I had a number of conversations about knowledge graphs, which are an old idea, but they are coming back around. Knowledge graphs have been hard to implement for the sector (or specific issue areas) because keeping them up to date is expensive (and often manual), and tools to query them are clunky unless you are Google. LLMs actually help alleviate both of these problems: they make it easier to ingest unstructured data, and they can reason about relationships in the knowledge graph. I’m looking forward to following up on the potential here for a number of domains.
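For readers who haven’t worked with knowledge graphs, here is a minimal sketch of the core idea: facts stored as subject-relation-object triples that can be queried by pattern. Everything here is an illustrative assumption (the `Fact` type, the relation names, the sample rows); a real system would use a graph database, and the LLM’s job would be to extract rows like these from unstructured text, which is not shown.

```python
from typing import NamedTuple


class Fact(NamedTuple):
    subject: str
    relation: str
    obj: str
    source: str  # where the fact came from, so it can be re-checked when it goes stale


# Hypothetical facts of the kind an ingest step might extract from documents.
FACTS = [
    Fact("Funder X", "funded", "Org A", "press release"),
    Fact("Org A", "partners_with", "Org B", "annual report"),
    Fact("Org B", "applied_to", "Funder X", "grant application"),
]


def match(facts, subject=None, relation=None, obj=None):
    """Return the facts that fit the pattern; None acts as a wildcard."""
    return [
        f
        for f in facts
        if (subject is None or f.subject == subject)
        and (relation is None or f.relation == relation)
        and (obj is None or f.obj == obj)
    ]


if __name__ == "__main__":
    # Who has applied to Funder X?
    for fact in match(FACTS, relation="applied_to", obj="Funder X"):
        print(fact.subject, "via", fact.source)
```

Keeping a `source` on every fact is a deliberate choice in this sketch: it is what makes automated refreshing plausible, since each fact points back to something that can be re-read.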

To that point, I also had a few conversations about how effective social-listening applications for the sector are missing. Since the bottom of X rotted out under new ownership, the landscape for how organizations communicate their work has fragmented and widened. We need to be able to parse across walled gardens like X, Meta properties, and LinkedIn, but also federated systems like Mastodon, Bluesky, and Threads. We need better real-time news and press-release monitoring. Without these social-listening tools, we’re stuck knowing only what is told to us through the onerous task of organizations updating data about themselves in various systems.

Lastly, we need to treat grant applications and grant reports as sector-level data and include these in our “listening” to create data about the sector. We need to know what projects weren’t funded, since every call has worthy applications that go unfunded. These should be easy to discover and easy for other funders to pick up with minimal process. Folks like the Philanthropy Data Commons are working on some of what we need to make this a reality, and we need to buy in to these systems as a sector to make them succeed.

Product-first Nonprofits

This year more than ever I talked to product-first nonprofits. When I say product-first, I mean that their core offering is a technology-enabled product that helps their beneficiaries. Rather than using technology, they own it, and that means they can design it for their use cases. This is a change that has been happening in the decade that I have worked in the sector, but it really hit home at GTF this year. We can finally respond to people whose needs the market has failed to meet, with user-centered interventions that offer a great experience rather than just addressing a concrete need. In fact, there was just talk in the conference Slack of organizing a demo day for product builders.

I’ll call out a few in particular, but there were so many that it was clear this was a trend.

  • Everfree builds tools to support survivors of trafficking (I’m a little biased since we have been helping them build!)
  • Nexleaf Analytics creates software to manage vaccine cold chains
  • Justicia Lab develops AI tools to educate immigrants on their rights and support their access to justice
  • Same Same Collective creates AI chat experiences to support LGBTQI+ youth
  • Benetech develops inclusive technology to make reading and learning accessible, for example through Bookshare, their braille and large-font ebook library
  • Emerson Collective supports a portfolio of tech- and data-enabled products in healthcare, edtech, climate and more
  • Giving Compass provides customized guidance to donors looking to connect with organizations

Moving beyond “soft” infrastructure

I was part of a number of conversations about “hard” technical infrastructure for the sector, such as the hardware, software, data, and technical tools that nonprofits use to achieve their missions. Conversations spanned from collectively owned datacenters and compute resources to software applications, datasets, algorithms/AI models, and other digital public goods. However, to many people the idea of actually building these kinds of infrastructure felt unachievable because funding, capacity, coordination and expertise were out of reach.

Some people I talked to fell back to the position that we need soft infrastructure instead: training, cross-org knowledge sharing, events and convenings, legal and other services for interacting with vendors, and more. This felt achievable and fundable, but it’s not an inspiring vision. I think it’s a mistake to backpedal to a world in which we just fund soft infrastructure. We need to do the hard thing and build hard infrastructure for the sector. Commercial actors will never be fully aligned with our priorities and principles.

One spark of inspiration was realizing that we have seen historical technologies, like electricity, move from being a luxury good to essential infrastructure. I think the same thing will happen for AI tools, and the more support for AI tools the sector can provide, the faster and smoother this transition becomes. These tools will become utility-level technical infrastructure for knowledge work, so we should get ahead of the curve.

The limitations on capacity and expertise to build hard infrastructure are also worth considering. This is not the first time I’ve had conversations about the ways in which great research labs of the past, like Xerox PARC, provided a breakwater for waves of technological innovation. The R&D lab is a model that concentrates experts in one place and gives them the freedom to explore applications without needing to commercialize any output. These experts then work with product owners to determine how new technologies are best used, without product roadmaps being buffeted about by hype waves. As a sector, we should think about a membership-model R&D lab for data and AI. By synthesizing and de-hyping new technologies, such a lab can provide guidance to philanthropies on promising directions, advise specific sectors on specific use cases, and create the prototypes and pilots that can make their way into nonprofit products. Treating this as a co-op, from both a financial and a participatory perspective, means it can be shared infrastructure rather than duplicating this work across many orgs (or, worse, funding lots of projects that won’t go anywhere).

A key part of discussions around harnessing technology is that open source won’t save us. We need to dispel the misconception that there will be open-source products that we can just pick up and use. This is a misunderstanding of what makes open source work. Open source works because the users of the technology also have the expertise to spot what is wrong and fix it for everyone. This is a virtuous cycle that simply never gets off the ground for end-user products because the users with the feedback can’t fix the issues. We’re active open-source developers, and it is amazing for developer tools. However, it’s not the means by which great products get built; it’s a foundation those products build upon.

AI Innovations for the Social Sector

Finally, there were a lot of one-off ideas that inspired me about how AI innovations could make their way into the nonprofit sector and what they might do for us. Here are a few of those ideas worth considering:

  • Privacy-enhancing technologies still come up a lot (and this is a space we work in), and folks are interested in applying them to GenAI, especially under the current administration. For example, we should build the rails for end-to-end encrypted chats with chatbots. We should make it easier to deploy federated models or on-device AI models where that makes sense.
  • We should have standards and practices for portable, vendor-agnostic tools. For example, if I spend a lot of time chatting with one LLM, I want to be able to port the “memory” it has of me over to a new system.
  • Organizations need AI sandboxes: safe, secure places to try out generative AI use cases without having their data shared with a third party for training.
  • AI tools present the opportunity for effective, just-in-time training. We have talked about upskilling the nonprofit workforce and some orgs work on building skills in general. We should have AI tools that support learning in the moment when folks are asked to do a new task rather than education and training initiatives that stand on their own.
  • I had some discussions about AI-first nonprofit operations. The provocation is: how do we build dynamic, responsive, fast-moving organizations by making them AI-first? What are the risks and benefits?
  • We have the opportunity to build a sector-level Model Context Protocol Server. Imagine if any LLM from any vendor had access to deep knowledge about organizations from our sector data sources, could surface actions like donating or applying for a grant directly in the chat, and could report on trends across issue areas. Serving this knowledge into the contexts where end users are could make a big difference.
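To give the last idea some shape, here is a rough sketch of the kinds of named tools a sector-level server might expose to an LLM. It deliberately does not implement the Model Context Protocol itself (a real server would use an MCP SDK and the protocol’s own message format); it only illustrates the pattern of describing capabilities as callable tools. The tool names, the sample organizations, and the URLs are all hypothetical.

```python
# Hypothetical sector data; a real server would query shared sector data sources.
ORGS = {
    "org-a": {"name": "Org A", "issue": "housing", "donate_url": "https://example.org/donate/org-a"},
    "org-b": {"name": "Org B", "issue": "housing", "donate_url": "https://example.org/donate/org-b"},
    "org-c": {"name": "Org C", "issue": "literacy", "donate_url": "https://example.org/donate/org-c"},
}


def lookup_org(org_id):
    """Return what is known about one organization."""
    return ORGS[org_id]


def orgs_by_issue(issue):
    """List the names of organizations working on an issue area."""
    return sorted(o["name"] for o in ORGS.values() if o["issue"] == issue)


def donation_link(org_id):
    """Return a link the chat client could surface as a 'donate' action."""
    return ORGS[org_id]["donate_url"]


# The registry is what a client would discover: tool name -> callable.
TOOLS = {
    "lookup_org": lookup_org,
    "orgs_by_issue": orgs_by_issue,
    "donation_link": donation_link,
}


def call_tool(name, **arguments):
    """Dispatch a tool call by name, as a server would on behalf of an LLM."""
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    return TOOLS[name](**arguments)


if __name__ == "__main__":
    print(call_tool("orgs_by_issue", issue="housing"))
    print(call_tool("donation_link", org_id="org-a"))
```

The hard part in practice is not this dispatch layer but the shared, trustworthy data behind it, which is why the sector-level data discussion above matters.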

We at DrivenData would love to help push these concepts forward and make the technology for the sector better, particularly where we can make AI tools that prioritize the sector’s needs and drive the data infrastructure that facilitates it. Reach out if that’s interesting to you.

I’d love to hear what struck you if you were there. And if you weren’t, which of the ideas above do you think work, and which don’t?
