That’s an output, not an outcome
“That’s an output, not an outcome.”
In my day-to-day work, this is probably the most common thought in my head. Monitoring and Evaluation (M&E) folks tend to use it to school (rather than teach) non-M&E folks on something they supposedly ought to know, but don’t. I suspect, more than anything else, we tend to do this to sound clever and to vaunt our niche “expertise.”
But, is this something we really know? Is the distinction between an output and an outcome really so clear cut?
Just do a Google search with “what’s the difference between an output and an outcome?” and you’ll see a world of contradictions.
A recent analysis of SmartyGrants Outcomes Engine data, covering 664 grantseeker outcomes from nine grantmakers across 19 funding rounds and 613 applications, found that only a quarter were outcome statements.
A few months ago, Ann-Murray Brown ran a poll asking: “Are you often confronted with persons being confused about the differences between outputs and outcomes?” Of the 802 people who responded, 88% agreed.
Michael Quinn Patton also made a video to clarify the difference only a few months ago:
So, the distinction is an issue with which the evaluation field is still struggling.
The seduction of results chains
Most people are accustomed to seeing the following results chain:
In Patricia Rogers’ theory of change guide for UNICEF, we find the following distinctions between output and outcome:
- Output: The immediate effects of programme/policy activities, or the direct products or deliverables of programme/policy activities. For example, the number of vaccines administered.
- Outcome: The likely or achieved short-term and medium-term effects of a programme or policy’s outputs, such as a change in vaccination levels or key behaviours.
On one hand, I agree with Rogers’ distinctions. In her definitions there’s a logical chain from lower to higher levels of change and consistent flow from activities to impacts. On the other hand, the simple distinctions are potentially misleading. Rogers is very much aware of this, of course, and acknowledges that there are many ways to represent a theory of change. Nonetheless, a straightforward definition like this relies on how impact has been defined in the first place (and by whom) and an assumed hierarchy of results.
Such hierarchies are often contested. For instance, is reduced corruption a process outcome on the way to improved services? Do improved services lead to reduced corruption? Is it both, or neither? As Brendan Halloran reminded me, the answer depends on your implicit theory of change.
Logically, outputs follow activities. But, you could equally argue that there’s often co-presence. An activity and the output can be interwoven in the same “event.” Rather confusingly, Keystone’s theory of change guide even defines outputs as the products AND activities that you do.
For Rogers, the immediate effect of the activity of administering vaccines is that a vaccine has been administered. We might find the same for producing a report. The immediate effect of the activity of producing a report is that a report is produced. In this light, outputs are often just another way of phrasing the activities delivered (or actions) rather than an “effect” of those activities (i.e., a change, result, or consequence). For a vaccine there is a direct effect (physical reaction) which comes from administering the vaccine (activity/output) without the patient making any choices.
So far, we’re chiefly talking about our own actions and behaviours in the project/programme: We administered vaccines; we produced a report.
What WE do and what THEY do
With such a model there’s still a risk of assuming we have more control over things than we actually do, but these spheres are a good place to start. I’ve seen many versions of such a diagram. CGIAR, for example, has a similar one where outputs are things within the sphere of control (chiefly products, e.g., research outputs), outcomes are what is influenced (behaviour changes), and impacts sit in the sphere of interest, which refers to things like Sustainable Development Goals (SDGs), i.e., long-term effects linked to behaviour changes.
Diminishing control: read the diagrams right to left
Surprisingly, Michael Scriven’s Evaluation Thesaurus doesn’t actually include an entry for outputs. Nonetheless, there is, I think, more than one type of output hidden beneath Rogers’ definition. Yet, we rarely distinguish these. We have a classic output as deliverable. As in the examples of vaccines and reports, the distance and potential friction between activities and outputs from the point of view of those administering/producing them is minimal. But, this is the wrong way of looking at an outcome chain (from left to right). We should be reading results chains from right to left. Does anyone read the report you produced and then do anything with the information? Does it just sit on a shelf gathering dust? Does it even make it onto someone else’s shelf to gather dust?
The other type of output you commonly see relates to stakeholder participation. This is actually rather more complex. On one hand, you’ll often see # trainings delivered as an output. But, we also see # people attending or trained as the output(s). The latter is somewhat problematic because it assumes that there are no barriers to participation and that participation is not a new activity or behaviour for those participants. This is why looking at results chains from left to right is misleading. Whether stakeholders attend is not within your direct control. You may have significant influence by offering information and incentives. Yet, something as mundane as someone attending an event or training for the first time or even deciding to get a vaccine may, in fact, reflect a significant change in behaviour for some individuals or groups. When I say “change,” I mean an act or process through which something (or someone) becomes different.
New actions and behaviours
I was in the Dominican Republic recently working with a World Vision project and talking about what outcomes to document (or harvest). One that arose from the project team was the participation of Haitian families in parent-teacher association (PTA) meetings. Ostensibly, this doesn’t look like an outcome. Isn’t attendance just an output? But when you think a little harder about changes for whom and under what circumstances, it could well be a significant change in behaviour. It is, after all, a new action for that family at the very least.
So focused are we on hitting a ‘participant’ target (the simple metric) as a deliverable that we rarely think very hard about the behaviour of people who don’t come and why, and what activities might turn that around. But, we really should — both from an ethical and a pragmatic standpoint.
How might we know if such a behaviour is significant? There is no end of potential criteria to consider, but we might think about whether it’s common for Haitian parents to attend these meetings or whether this is a new practice or pattern of behaviour in the school. Then we might think about whether this is plausibly connected to project/programme activities.
When I looked for further information, I was quite surprised by how limited and out of date the data are on Haitian children in school in the Dominican Republic. Most of the data available were from 2015, and even these sources pointed back to data from 2012. Child and parent participation is situated within a wider context of visa problems, child labour, and large-scale child trafficking. So, numbers are politically sensitive.
In such a context, your “output” of # Haitian parents attending a parent-teacher association meeting might, in fact, be quite significant indeed. The challenge is to demonstrate whether it’s a new behaviour, or not.
Lab rats or agents of change?
While Scriven doesn’t mention outputs in his Evaluation Thesaurus, he does directly link effects and outcomes. He defines an effect as ‘an outcome or type of outcome.’ But this takes us round in circles. So, maybe we need to be more specific.
In Outcome Harvesting, an outcome is defined as:
“A change in the behavior, relationships, actions, activities, policies, or practices of an individual, group, community, organization, or institution” (Wilson-Grau and Britt, 2012: 2).
A focus on behaviour changes over effects takes us in a slightly different direction. Many logic models and theories of change are expressed patronisingly as though stakeholders such as “beneficiaries” were blank slates. This was unfortunately amplified by the randomista mania of the early 2000s which still tends to present participants like lab rats.
Yet, as Amartya Sen’s conceptualisation of agency reminds us, actors should be and are ‘active participant[s] in change, rather than … passive and docile recipient[s] of instructions or of dispensed assistance.’ If outcomes are about behaviour change, then they require choices, decisions, and actions.
In a similar light, Causal Link Monitoring focuses on actors’ use of outputs, as the diagram below shows:
A focus on use is a little less passive and patronising. It’s more agency-focused, and prompts us to read the diagram a little bit more from right to left. Use is, however, still too narrow a type of action or behaviour change. There are many other relevant verbs.
In my view, any type of new or different action by another actor should be considered an “outcome,” provided that those actions are partly in response to (i.e., a result or consequence of) the actions of the project/programme.
Prior conditions and decisions
In many respects, the diagram below from John Mayne is a more accurate representation of the long, multi-step processes we’re talking about, whether or not we use the words output, outcome, and impact:
There’s actually a lot required prior to the kinds of behaviour changes we’re talking about. Here Mayne focuses on reach and reaction (mothers reached). However, if you think about it, people deciding to participate in something they haven’t participated in before involves reasoning and a response (or reaction) to a stimulus. Scriven refers to a post-treatment effect, but in the case of the vaccine, patients make the choice to receive the vaccine (or not) before the vaccine can be administered. Some of the behaviour change therefore comes before the output.
Niki Wood suggests that the language of outputs and outcomes is out of date and argues that we should do away with them. Following Mayne, Wood advocates that we should ‘use plain language to describe the pathway that takes you from point A to point B’ and we should focus on behaviour change as the ‘crucial hinge point of your programme.’ Cathy Shutt argues that debates regarding the difference between outputs and outcomes can lead to people losing the behaviour change woods for the trees.
I agree that we should focus on behaviour change. But, to repeat my argument, ALL behaviour changes that can be plausibly linked to the project/programme (directly or indirectly) should be considered outcomes.
More important than the semantic gymnastics, in my view, is Mayne’s emphasis on behaviour change assumptions. I’ve written about this previously. As Cathy Shutt put it to me:
“The most important thing is making assumptions about how our actions will contribute to different change processes explicit and continually evolving these as our understanding of context, other partners’ or colleagues’ frames, and causal links grow.”
Shutt argues that when looking at challenging sectors such as transparency, accountability, and participation, a Capability, Opportunity, and Motivation (COM-B) framework or a Power, Capabilities, and Interests framework gets us quite a long way towards making the connections between outputs and outcomes. Both are geared towards similar things. There is, I think, a growing consensus here: we can’t assume that behaviour change will just happen. Certain conditions need to be in place for interventions to have effects on people’s behaviour. Nancy Cartwright and Jeremy Hardie refer to these as support factors. Cartwright later refashioned these as moderators.
Periods and depths of effects
In Scriven’s Evaluation Thesaurus, we find that ‘outcomes are usually the post-treatment effects, but there are often effects during treatment (e.g., enjoyment of a teaching style).’ Scriven thus suggests that temporal order is an insufficient guide to delineate outputs and outcomes. The vaccine uptake example above also demonstrates this. There’s even a whole debate on simultaneous causation, if we want to add another layer of sophistication.
Scriven suggests that we should nonetheless try to distinguish immediate outcomes, end-of-treatment/intervention outcomes, and long-term outcomes. This suggests three periods and depths of effects. For a vaccine, we’re talking about downstream effects such as preventing illness or fatality.
It doesn’t really matter whether we consider outcomes as short, medium, and long-term changes as realists do or whether we embrace the OECD-DAC definition of impact, as Rogers does. It doesn’t matter whether these are positive or negative, intended or unintended changes. However, different depths of effects do matter because not all outcomes have downstream effects that result in improvements in people’s wellbeing, and effects can also dissipate over time (e.g., vaccines become less effective).
Brendan Halloran suggested to me that this relates to how we assess our theories of action/change, because we have to think about what milestones we may assume connect outputs to medium-term outcomes. For example, does advocacy engagement lead to decision maker commitment? How solid is that commitment? And is this a necessary step to further actions?
Halloran’s delineation between a ‘response’ and ‘responsiveness’ is geared towards distinguishing what might be a one-off, isolated action from sustained and reliable patterns of positive response by governments to citizens. I would personally call this institutionalised responsiveness. But the takeaway is that you can’t take for granted the durability of effects.
Forwards, backwards, sideways
Another central problem with the output/outcome distinction is the direction of arrows. Results chains are just heuristics, but they are often interpreted as if they were a complete picture of what is required for change to happen. They aren’t. They reflect a likely direction of travel and a set of hypotheses about how and why change ought to happen.
In complex programmes and programming contexts, the heuristic of the results chain is constraining and potentially misleading. But, we squash these processes into the boxes to report to donors anyway. Results chains, with outputs and outcomes at their heart, imply that there are easily identifiable root causes that can be treated. Quite often, this isn’t the case. We commonly misidentify the problem(s) and particularly the likely interactions between these. It’s even been argued that there aren’t root causes in complexity.
Contrary to the unidirectional approach of the results chain, we should allow for causal (feedback) loops, because causal processes can be circular. Ideally, we want to represent this without turning diagrams into “horrendograms.” Feedback loops between outputs and outcomes can be positive or negative, and they can either accelerate or decelerate a process. These loops can arise within our own projects and programmes, through complementary or contradictory interaction effects, or between the efforts within our project/programme/coalition and the efforts of others in a wider ecosystem.
Vaccination programmes seem simple, but vaccine uptake is actually quite complex due to the relevant information and political ecosystems they are embedded within. For example, hearing the positive experience of others getting vaccinated may encourage some others to get vaccinated. But, hearing negative experiences may do the opposite. This is nicely explained by Vox:
The video shows not only how you might have feedback loops within your own ecosystems but also, potentially, how the outputs from other systems can disrupt whether your outputs will lead to hoped-for outcomes.
I hope I’ve prompted you to think a little harder about something so often considered to be simple. I don’t think the answer here is to have a unified definition of distinctions between outputs and outcomes. But, here are four things I think we can do.
Firstly, I think we need more humility as evaluators about these distinctions, particularly as we seem to disagree about the boundaries and these boundaries may vary in different sectors. Secondly, we need to be more honest about how these boxes we squeeze things into are incomplete heuristics. Thirdly, our focus should be the behaviour change (or the actions) of others, whatever we call the boxes. Fourthly, we should take into account likely positive and negative interactions to gain a fuller picture of what supports and inhibits change. I hope these will help us have a more thoughtful conversation.
Many thanks to Brendan Halloran and Cathy Shutt for their exchange on the debate and comments on an earlier brainstorm.