The City of Boston regularly inspects every restaurant to monitor and improve food safety and public health.
As in most cities, health inspections are generally random. As a result, inspectors spend time on spot checks at clean restaurants that have been following the rules closely, and miss opportunities to improve health and hygiene at places with more pressing food safety issues. Meanwhile, each year, millions of people visit these same restaurants and post Yelp reviews about their experiences. The information in these reviews has the potential to improve the City’s inspection efforts and could transform the way inspections are targeted.
DrivenData, in partnership with Yelp and Harvard, and with support from the City of Boston, structured a predictive challenge to link Yelp reviews and ratings to the results of Boston’s hygiene inspections. The goal was to use data from social media to narrow the search for health code violations in Boston, pulling out the words, phrases, ratings, and patterns that predict violations to help public health inspectors do their jobs more effectively.
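As a rough illustration of what "pulling out the words that predict violations" can look like, the sketch below scores each word by how much more often it appears in reviews of restaurants that later had a violation. It is not the method used by any competitor: the reviews, labels, and the `word_log_odds` helper are all hypothetical, and real entries would combine many more signals (ratings, phrases, inspection history).

```python
from collections import Counter
import math
import re

# Hypothetical toy data: (review text, whether a later inspection found a violation).
reviews = [
    ("The kitchen looked dirty and my plate was greasy", True),
    ("Dirty tables and a stale smell near the restrooms", True),
    ("Friendly staff and a spotless dining room", False),
    ("Fresh food, clean tables, friendly service", False),
]

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def word_log_odds(labeled_reviews, smoothing=1.0):
    """Score each word by its smoothed log-odds of appearing in violation reviews."""
    flagged, clean = Counter(), Counter()
    for text, had_violation in labeled_reviews:
        (flagged if had_violation else clean).update(tokenize(text))
    vocab = set(flagged) | set(clean)
    # Additive smoothing keeps words seen in only one group from producing log(0).
    flagged_total = sum(flagged.values()) + smoothing * len(vocab)
    clean_total = sum(clean.values()) + smoothing * len(vocab)
    return {
        word: math.log((flagged[word] + smoothing) / flagged_total)
        - math.log((clean[word] + smoothing) / clean_total)
        for word in vocab
    }

scores = word_log_odds(reviews)
# Words with the highest scores lean most strongly toward violation reviews.
top_words = sorted(scores, key=scores.get, reverse=True)[:3]
print(top_words)
```

Positive scores mark words that lean toward restaurants with violations, and negative scores mark words that lean toward clean ones. With only four toy reviews the ranking is illustrative at best.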
Modelers with the top-performing approaches predicted where minor, major, and severe hygiene violations would surface over the following six weeks. Over that time, DrivenData compared their predictions with what public health inspectors actually found when they went into restaurants. At the end of the evaluation period, we saw which teams had developed the most accurate predictions.
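One way to compare predicted and observed violation counts is sketched below. The text above does not state the scoring rule, so this example assumes a severity-weighted root mean squared logarithmic error. The weights, the `weighted_rmsle` name, and the sample counts are all illustrative assumptions rather than the competition's actual definition.

```python
import math

# Assumed severity weights (illustrative only): severe violations count most.
WEIGHTS = (1.0, 2.0, 5.0)  # minor, major, severe

def weighted_rmsle(predicted, observed, weights=WEIGHTS):
    """Root mean squared log error between severity-weighted violation scores."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    squared_errors = []
    for pred, obs in zip(predicted, observed):
        # Collapse the three counts for one inspection into a single weighted score.
        pred_score = sum(w * p for w, p in zip(weights, pred))
        obs_score = sum(w * o for w, o in zip(weights, obs))
        # log1p(x) = log(1 + x), so inspections with zero violations are handled.
        squared_errors.append((math.log1p(pred_score) - math.log1p(obs_score)) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical (minor, major, severe) counts per inspection.
predicted = [(2, 0, 1), (0, 0, 0), (4, 1, 0)]
observed = [(3, 0, 1), (1, 0, 0), (4, 1, 0)]
print(round(weighted_rmsle(predicted, observed), 3))
```

Lower scores mean the predictions tracked the inspectors' findings more closely, and a perfect set of predictions scores 0. Working on a log scale keeps one badly missed restaurant from dominating the total.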
The final results were studied by Harvard researcher Mike Luca and covered in the Washington Post:
"Using the winning algorithm, Luca says, Boston could catch the same number of health violations with 40 percent fewer inspections, simply by better targeting city resources at what appear to be dirty-kitchen hotspots. The city of Boston is now considering ways to use such a model."
Let us know how we can help your business or organization tackle your most pressing data challenges.