Thinking beyond the best practice myth of “best buys”
With Florencia Guerzovich
International development organisations are forever looking for shortcuts to “best practice” solutions — “best buys” that supposedly deliver better “value for money.”
We want simple answers to difficult questions and quick fixes to complex problems that (magically) deliver sustainable impact. There’s no shortage of answers provided by top-ranked universities, evidence clearinghouses, and Nobel Prize-winning economists.
As one of us discussed previously, Ruth Levine told us that “evidence on intervention effectiveness (and cost-effectiveness!) is a click away” through The International Initiative for Impact Evaluation (3ie)’s Development Evidence Portal. Indeed, the Global Education Evidence Advisory Panel co-hosted by the Foreign, Commonwealth & Development Office (FCDO), United Nations Children’s Fund (UNICEF), United States Agency for International Development (USAID), and the World Bank has told us what the smart buys are. There’s now even a Global Evidence Commitment to back up all this evidence-based solutionism.
These are all well-meaning, but unfortunately they are flawed and ultimately take us in the wrong direction. We will first point out some of the general problems with this way of thinking, and then focus on how this applies to social accountability (school management) efforts in the education sector to illustrate what’s going wrong.
The challenge of achieving something such as quality education in low- and middle-income countries isn’t solved by merely copying and pasting from a central database of supposedly “high-quality studies” where “what works” has been proven by “experts” (a.k.a. development economists). That idea rests on a problematic assumption: that assessing the true “value” of an intervention is a simple empirical exercise conducted (mostly) by (micro)economists based in the United States. Evaluating value involves choices about what is valued, reasoning, an appraisal of relevance, and a meaningful understanding of (political) context. These are invariably missing from such platforms.
Years ago, a Ministry of Education official in Mongolia explained his predicament to one of us: he didn’t need a “best buy”; he was looking for a “smart investment.” At the time, Mongolia was investing in training for school staff and management, in line with development partners’ recommendations. But there was a glitch in the plan. In Mongolia, as in many other countries with clientelistic ties, schools can be an electoral bounty: with every new electoral cycle, principals and teachers change. Spending on training was not really a smart buy in a context with this “rinse and repeat” dynamic. The official needed a way to make what he bought stick. In most low- and middle-income countries, electoral politics and clientelism systematically impede quality education. Last April, when we mentioned this example to Ministry officials in the Dominican Republic, they nodded vigorously as if to say, “you get my problem.” Indeed, Clio Dintilhac’s key takeaways from the recent #WWHGlobalEd24 included a similar concern.
These concerns reflect a bigger issue: we know that change in education systems doesn’t happen simply because a project succeeds or fails according to one’s preferred metrics. And yet this idea continues to underpin several leading frameworks that tell us what “the evidence” is.
“Best buys” are a misleading way to assess the true value of interventions, particularly socially complex ones. The approach has several significant flaws:
- They assume that all interventions are trying to achieve the same thing (e.g., improve test scores) when they are often trying to achieve quite different things. When your primary aim isn’t to improve test scores, for example, you shouldn’t be assessed as if it were. Lant Pritchett argues that, for this reason, the results are intellectually incoherent.
- They exclude most of the relevant evidence in a sector. The Panel’s obsession with Randomized Controlled Trials (RCTs) means that the vast majority of its evidence is limited to what this narrow method is capable of studying. Relatedly, they focus on a handful of simple (potentially misleading) variables, such as test scores, at the expense of other learning outcomes and more complex systemic change.
- They straitjacket interventions into faithfully replicating past designs (fidelity), when what is actually needed, because contexts change over time and space, is for interventions to be adapted to new conditions and challenges. The fidelity logic makes these interventions brittle (i.e., they don’t travel well).
- The assessment is biased towards quick and simple solutions as the smartest buys. These include interventions that RCTs can easily measure, such as providing information on the costs and benefits of education and structured lesson plans (i.e., the basics of teaching). Complex and “harder to measure” interventions, such as involving communities in school management, are graded lower as promising but with limited evidence (i.e., not enough RCTs).
- The best-to-worst categorization in the smart buys doesn’t reflect the political feasibility of each intervention. Even though the smart buys concede that context and political economy are important, the authors don’t take them seriously, leaving them to the judgement of policymakers themselves. Politics and political economy regularly get in the way of effective implementation, so many of the “best practice” recommendations likely travel poorly.
A flawed review of “the” evidence
The “best buys” report is perched on the supposed strength of the evidence it reviews. And yet the review fails to look across the evidence that exists, and it is thin even on the questions it does speak to. In a recent book on systems thinking and international education, Faul and Savage look into the track record of “so-called common sense” global expert technical solutions over the past decade. They reach a sobering conclusion: in Kenya, the successful pilots used to justify “best buys” had “zero impact on student learning when rolled out nationwide because of lack of consideration of system factors,” and similar interventions had no impact on school management or student outcomes in India and Indonesia.
We’ll now show what this looks like by delving into a specific set of interventions focused on school-based management.
Despite assessing 13,000+ studies overall and claiming to focus on interventions that have demonstrated long-term positive impacts, the “smart buys” assessment of the value of school management interventions (promising but limited evidence) is based on only 5 (five!) locally bounded, short-term interventions in Gambia, India, Indonesia, and Kenya. Predictably, these are all assessed through RCTs. Most were implemented by Panel members themselves, their family, or friends. So the “limited evidence” is really just a feature of the Panel’s own cherry-picking of preferred studies from their nearest and dearest. Some of these studies (such as the Indonesia case discussed below) have very significant problems that the Panel doesn’t mention, because those problems weren’t reported in the RCT but in a qualitative World Bank study the Panel ignored, one that appears to have been essentially expunged from the public record. The studies also have serious blindspots. For instance, the India experiment aimed to energise a village education council (VEC) but didn’t specifically target building relationships between village government and the school committee; it instead hoped that providing information alone would shift this power relationship, and the proxies used to assess relationships were very thin (e.g., did parents visit the school), telling us very little about power dynamics.
A wider review we conducted of the evidence from 157 interventions tells a different, more context-informed story. There isn’t a “best practice” approach that should be neatly replicated; instead, there are “best fit” approaches in which interventions are well adapted to their spatiotemporal context.
The Global Education Evidence Advisory Panel uses evidence on test scores, but not on broader learning outcomes. It seems far too concerned with literacy and numeracy tests, assuming that these are good proxies for student learning. One multi-arm impact evaluation in Indonesia that the Panel cites showed improved test scores in language and mathematics, but the World Bank’s accompanying research (widely ignored) demonstrated that this focus led teachers to drop creative components of lessons because these weren’t in the service agreement on literacy and numeracy. Indeed, the way the intervention was implemented damaged staff morale and created conflict among teachers and between teachers and community members. So, suggesting that the initiative improved education quality is highly dubious. Quality education is about much more than just test scores. As Lant Pritchett reminds us, it’s about “learning, skills, ideas, competencies, dispositions, and behaviours.” Many of these are inadequately captured in tests.
We need alternative approaches to assessing the value of interventions that consider implementation context, political economy, and relationships as primary rather than secondary concerns.
In the next blog, we’ll unpack these further in relation to social accountability programming.