The Platinum Standard

January 16, 2015

The rise of results-driven strategies like evidence-based policymaking, impact investing, social impact bonds, collective impact, shared value, and public value has depended largely on the premise that we can reliably measure social outcomes. But evaluating impact is hard stuff: studies can take years to complete and cost millions of dollars for just one program. Few can argue with investing in social programs that work, or with the premise that government should use evidence for budgeting decisions. But the dilemma facing policymakers and philanthropists remains: how can we be “evidence-based” if we don’t yet have an ample and actionable evidence base? Indeed, only a handful of “top tier” evidence studies exist. The answer may be found in the burgeoning field of applied social research known as meta-analysis.

Many consider randomized controlled trials, or RCTs, to be the “gold standard” for evidence. Randomization provides a high degree of proof for a very narrow set of facts: a particular program, under a particular set of conditions, for a particular population of people, at a particular time, made a difference. But while the specificity of RCTs can make them very credible, their precision also creates a number of critical limitations. Moreover, if the goal is to develop a universal evidence base for policymakers to use, getting there one RCT at a time could take decades and cost billions of dollars, even using low-cost RCTs.

There might be another way. Meta-analysis is a research approach that capitalizes on the value of multiple RCTs and brings predictive power to social impact. Think of meta-analysis as conducting “research about previous research.” Meta-analysis allows researchers to analyze and synthesize the characteristics of programs and the effects of those programs in a systematic, replicable manner (Lipsey, Effectiveness of Juvenile Justice Programs). Meta-analysis can improve the inferential value of individual RCTs. According to R. A. Fisher, one of the early pioneers of randomization: “when a number of quite independent tests of significance have been made, it sometimes happens that although few or none can be claimed individually as significant, yet the aggregate gives an impression that the probabilities are on the whole [better] than would often have been obtained by chance.”
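To make Fisher's observation concrete, here is a minimal sketch of his method for combining independent p-values. It is only an illustration: the three p-values are hypothetical, the function name fisher_combined_p is our own, and modern meta-analyses usually pool effect sizes rather than p-values. The point is simply that several results that each miss the conventional 0.05 threshold can be significant in aggregate.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom when all k null hypotheses are true.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    k = len(p_values)
    half_x = -sum(math.log(p) for p in p_values)  # this is X / 2

    # For an even number of degrees of freedom (2k), the chi-square
    # survival function has a closed form:
    #   exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return math.exp(-half_x) * total


# Hypothetical p-values from three independent evaluations (assumed data).
study_p_values = [0.08, 0.10, 0.07]
combined = fisher_combined_p(study_p_values)
print(f"combined p-value: {combined:.4f}")
```

None of the three hypothetical studies is significant at 0.05 on its own, yet the combined p-value comes out at about 0.02.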

We would not be the first sector to use analytics as a way to predict outcomes. In fact, we may be one of the last. Many other sectors, from finance to healthcare to music to baseball, have successfully combined meta-analytic techniques with applied statistics and database technology to improve outcomes. Credit bureaus such as Experian have developed statistical models to isolate the variables that are statistically correlated with loan repayment. This capability has transformed lending, allowing lenders to successfully predict creditworthiness and financial outcomes, in turn enabling them to make more accurate and efficient decisions. The Human Genome Project has aggregated RCTs and other genetic research to help scientists understand common diseases, design more effective medications, assess risks, and more generally predict health outcomes. And Pandora Radio created a highly detailed analytical system called the Music Genome Project® that quantifies all musicological and experiential aspects of a song or musical work in a standardized manner within each music genre. Every song has its own “genomic imprint” and can be algorithmically matched to other songs with correlated attributes that are already favored by a listener; this “meta-analytical” approach enables Pandora to successfully predict which new songs will match a listener’s existing musical preferences.

Benefits and Challenges of Meta-Analysis

Indeed, one of the most powerful benefits of meta-analysis may be the ability to quickly and affordably create a “quasi” evidence base for all programs, even those that haven’t been specifically studied. According to Mark Lipsey, author of Practical Meta-Analysis, and a leading evaluator in the field of juvenile delinquency: “Evidence-based practice can be extended beyond brand-name model programs to those many local and home-grown programs that are more generic instances of program types whose effectiveness is adequately supported by research.” In other words, meta-analysis has the potential to turn gold standard studies into a “platinum standard” evidence base.

Another benefit of meta-analysis is that it can help us identify the factors across contexts and programs that lead to differential success. This is critical, because if we can derive the social impact “genes” that determine outcomes, we can more consistently measure and evaluate existing programs, and design better programs. For example, in “Visible Learning,” a meta-analysis on student achievement, John Hattie analyzes evaluations across many educational programs and academic contexts to understand the degree of influence that certain factors have on student success. His findings present an opportunity for educators and practitioners to improve student outcomes by introducing these proven factors into programming. By identifying the factors and contexts that lead to program success across a large number of evaluated programs, a properly done meta-analysis can lead to actionable, predictive conclusions about the likely effectiveness of programs not yet studied.
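One common way to look for such factors is a subgroup (moderator) analysis: pool the effect sizes of programs that share a feature, pool those that lack it, and compare the two. The sketch below assumes a simple fixed-effect, inverse-variance model; the effect sizes, variances, and the has_feature flag are invented for illustration and do not come from any study cited here. A real analysis would more likely use random-effects meta-regression, and a difference between subgroups is a correlation, not proof that the feature causes the outcome.

```python
import math


def pooled_effect(effects, variances):
    """Fixed-effect (inverse-variance weighted) mean and its standard error."""
    weights = [1.0 / v for v in variances]
    mean = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    return mean, math.sqrt(1.0 / sum(weights))


# Hypothetical evaluations: (effect size, variance, has_feature).
studies = [
    (0.45, 0.02, True),
    (0.38, 0.03, True),
    (0.52, 0.04, True),
    (0.12, 0.02, False),
    (0.20, 0.05, False),
    (0.05, 0.03, False),
]


def subgroup(flag):
    """Pool only the studies whose has_feature flag equals `flag`."""
    rows = [(e, v) for e, v, f in studies if f == flag]
    return pooled_effect([e for e, _ in rows], [v for _, v in rows])


with_mean, with_se = subgroup(True)
without_mean, without_se = subgroup(False)

# z-statistic for the difference between the two pooled effects.
z = (with_mean - without_mean) / math.sqrt(with_se**2 + without_se**2)

print(f"with feature:    {with_mean:.3f} (SE {with_se:.3f})")
print(f"without feature: {without_mean:.3f} (SE {without_se:.3f})")
print(f"z for difference: {z:.2f}")
```

In this invented data set, programs with the feature show a noticeably larger pooled effect, which is the kind of signal that would flag the feature as a candidate “gene” worth testing further.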

A growing number of researchers are using meta-analysis to codify the evidence in disciplines such as juvenile justice, prisoner re-entry, student achievement, service learning, after-school programming, obesity prevention, and more. Aggregating this knowledge, codifying it, and making it available to policymakers could create a fast track to the universal, predictive evidence base we’ve all been waiting for.

No doubt, there are challenges to meta-analysis as well: the number of studies available may not be sufficient; they may be of poor quality; and effects may not be homogeneous, so that grouping different causal factors may lead to meaningless estimates of effects. Moreover, meta-analysis requires that the analyst make a number of consequential decisions that could allow biases to creep in: what kind of studies (or program evaluations) should be included in the meta-analysis? Should only “gold-standard” RCTs be included? If so, how do we decide which ones meet that standard? How should outcomes be standardized so they are comparable across studies? How do we incorporate qualitative information into the analysis? Or should we? If these kinds of questions can be answered well, the widespread application of meta-analysis to questions of social policy will vastly expand the actionable evidence base beyond the relatively small number of narrowly-focused RCTs and existing meta-analyses available for most social outcomes.
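Two of these questions have standard, if imperfect, statistical answers. Outcomes measured on different scales are often converted to a standardized mean difference (here Hedges' g), and the homogeneity of effects is often checked with Cochran's Q and the I² statistic. The sketch below is a minimal illustration under those assumptions; the three studies and their summary statistics are hypothetical, and the variance formula is the usual large-sample approximation.

```python
import math


def hedges_g(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Standardized mean difference with Hedges' small-sample correction.

    Returns (g, variance of g) using the common large-sample approximation.
    """
    pooled_sd = math.sqrt(
        ((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / (n_t + n_c - 2)
    )
    d = (mean_t - mean_c) / pooled_sd
    j = 1.0 - 3.0 / (4.0 * (n_t + n_c) - 9.0)  # small-sample correction
    var_d = (n_t + n_c) / (n_t * n_c) + d**2 / (2.0 * (n_t + n_c))
    return j * d, j**2 * var_d


def heterogeneity(effects, variances):
    """Return (pooled effect, Cochran's Q, I-squared in percent)."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, q, i2


# Hypothetical studies reporting outcomes on three different scales:
# (treatment mean, control mean, treatment SD, control SD, n_t, n_c)
raw_studies = [
    (78.0, 74.0, 10.0, 10.0, 60, 60),      # test score out of 100
    (3.1, 2.9, 0.5, 0.5, 40, 40),          # grade point average
    (520.0, 510.0, 50.0, 50.0, 100, 100),  # reading score on a 500 scale
]

standardized = [hedges_g(*s) for s in raw_studies]
effects = [g for g, _ in standardized]
variances = [v for _, v in standardized]

pooled, q, i2 = heterogeneity(effects, variances)
print(f"pooled g = {pooled:.3f}, Q = {q:.2f}, I^2 = {i2:.1f}%")
```

A high I² suggests the studies are not estimating one common effect, which is exactly the situation in which a single pooled number risks being a meaningless average.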

We may get there sooner than we think…

The Impact Genome Project®

In 2013, Mission Measurement launched an ambitious, field-wide meta-analysis to codify RCTs, consider the results of other meta-analyses, and incorporate findings from existing quantitative and qualitative research in every field of social policy. We call it the Impact Genome Project™ (IGP). The IGP was announced at the Skoll World Forum in Oxford on April 10, 2014 and later at the White House on June 25, 2014. The promise of the IGP is to create a universal evidence base from which we can derive predictive analytics for policymakers and practitioners. The IGP will do this by standardizing the definition of universal outcomes in every field of social policy; collecting, grading, and codifying the existing evaluation literature in each given field; and statistically deriving the efficacy factors that can help us predict the outcomes of any social program. Using our universal evidence base, we can then derive benchmark data to encourage more efficacious and cost-efficient programming to create more meaningful impact throughout the sector. Policymakers, program directors, and philanthropic leaders alike can use this data to make more informed decisions about programming. Industry response to the IGP thus far has been very encouraging; a recent SSIR webinar on the IGP attracted over 2,700 registrations and generated 365 questions. Subsequent to this webinar, nearly 400 social-sector organizations have signed up to participate in the IGP.

Genomes are already under development in a number of different fields, including critical human needs, education, science & technology, and international development. The first genomes are expected to be announced at the Clinton Global Initiative Winter Meeting. In all, there are twelve genomes in the IGP: Education, Critical Human Needs, Youth Development, Health, Culture & Identity, Criminal Justice, Economic Development, Sustainability & Environment, International Development, Science & Technology, Arts, and Disaster Relief.

The IGP is not without controversy. Some have questioned the sufficiency of available evidence to produce meta-analyses in some fields. Others have questioned the ethics of using predictive data in the field of social impact, or the IGP’s ability to recognize innovation in social programming. Still, even with these caveats, the value of meta-analysis likely outweighs the alternatives: guessing, on the one hand, or waiting until we have “perfect” data, on the other.

Conclusion

Policymakers today need more than just a static library of RCTs to make good decisions; they need analytic tools that can help predict the outcomes of programs, compare benchmarks, and design more effective social interventions. Meta-analysis and the Impact Genome Project™ have the potential to create a step-change in the way we govern and invest in social change. Turning “gold” into “platinum” will dramatically lower the cost of evaluation for all social programs, help us identify the “gaps” in research where new RCTs would be most useful, and make evidence-based policymaking real within our lifetime.

— Jason Saul, Founder and CEO, Mission Measurement

— Nolan Gasser, Ph.D., Chief Genomic Officer, Mission Measurement, Chief Musical Architect of the Music Genome Project, and Adjunct Professor in Medieval–Renaissance Music History, Stanford University

— Randy Stevenson, Ph.D., Director of Data Science, Mission Measurement and Professor of Political Science, Rice University