The politics of valuing
Thomas Schwandt and Emily Gates’ new(ish) book Evaluating and Valuing in Social Research is probably one of the most explicit attempts to look at how values and valuing shape evaluation.
Although “value” lies at the heart of the word “evaluation”, the evaluation field has had a hot-and-cold relationship with explicit values. Schwandt and Gates define “value” as ‘normative (i.e., relating to a standard) and emotive commitments to what individuals, groups, and societies esteem, cherish and respect.’ For example, we value equal treatment under the law. On the other hand, they define “valuing” as the process of ‘how cherished and prized values are continually explained and examined as they are made plain in actions and practices.’
Their discussion of positive judgements (what is: descriptive and explanatory) and normative judgements (what should be) raises important questions about valuation. Like Vivien Schmidt, I believe we can distinguish between cognitive ideas (what is and what to do) and normative ideas (what is good or bad about what is) in the assertions we make about the world. Nonetheless, positive judgements are underpinned by normative judgements about what is good and bad in the first place.
There is really no such thing as “value-free” evaluation (or science, or for that matter, art). The assertion of “value-free” science is itself value-laden because it suggests that being value-free is not only possible but an eminently good thing.
As Schwandt and Gates discuss, it’s worth reflecting on the politics and values underpinning the “scientific” criteria we choose, as this shapes the methods used and the data collected. When we assert that something is good or true, we aren’t only making a statement about what is valued but also engaging in the political process of valuing.
My preferred definition of politics is Harold Lasswell’s pithy ‘who gets what, when, and how.’ For our purposes, we might consider politics to be the process of negotiating power (i.e., discourse and decisions) regarding whose values get evaluated, what gets evaluated, when, how, and by whom.
Whose values
Fundamental to such a process is Robert Chambers’ question Whose Reality Counts? Historically, in evaluation (and elsewhere), it’s been people who look and sound like me (middle-class white men with letters after their names).
Patricia Rogers’ discussion of Nan Wehipeihana’s remarks at the Australasian Evaluation Society conference some years ago is worth remembering. I can’t find the original presentation, but according to Rogers, Wehipeihana discussed how evaluators make decisions about an evaluation without reference to the values of the communities being evaluated, what is important to them, or what (in their view) constitutes credible evidence. This isn’t done accidentally, but by design. It stems from a technocratic elitism, and usually from a positivist epistemology.
In some respects, Bagele Chilisa goes one step deeper than the visible decision-making process. In her proposal for Made in Africa Evaluation, she underscores the importance of paradigms and asks whose ‘ways of perceiving reality (ontology), ways of knowing (epistemology) and value systems (axiology)’ are centred in the first place.
What we value
Our paradigms filter our assessments of what is true, credible, and valid. For example, Jenni Downes and Amy Gullickson’s excellent article on what “valid” means in evaluation draws almost entirely on definitions from North America. There is substantial definitional diversity within that literature, but not much diversity in whose values and validity criteria count.
Likely well aware of this, Schwandt and Gates advocate question- and criteria-driven evaluation. I think this is a good idea. They point out that:
‘Choosing criteria commits the evaluator to look for certain kinds of evidence and to appeal to certain kinds of warrants in order to justify resulting evaluative claims.’
So, we need to determine whose criteria count and why these count for the questions we want to answer.
When we look for “warrants,” we’re effectively marshalling reasons and evidence to justify an argument or claim. Kareem put the issue at hand nicely:
Evidence hierarchies
At this point, we come to the dilemma of evidence hierarchies. Schwandt and Gates draw particular attention to frameworks such as Grading of Recommendations, Assessment, Development and Evaluation (GRADE). They note the perennial issues of relying almost exclusively on experimental evidence in many such frameworks, particularly in evaluating multifaceted and complex problems. Cardiff University’s Specialist Unit for Reviewing Evidence (SURE) has a good list of critical appraisal tools. Many of these are useful, but as Schwandt and Gates argue, a strict technocratic focus on assembling evidence exudes a rather:
‘Naive, narrow notion of rational behavior… policymaking involves an array of non-scientific considerations, including political considerations and value preferences.’
So, we should bear in mind this wider view when we look at what counts as credible evidence.
The Department for International Development’s (DFID) 2014 How to Note: Assessing the Strength of Evidence is among the most commonly used guides in the international development sector. One key reason for this is that DFID itself funded many researchers to conduct evidence reviews and (presumably) prompted them to align with its guidance. In my view, it’s generally a good guide. However, like any guide, it’s not value-neutral. It strongly reflects the sort of validity criteria that are valued where I live (transparency, reliability, validity, etc.) and is heavily influenced by what is valued most in experimental methods, such as reducing the risk of bias.
It’s not that reducing the risk of bias is unwarranted. There are also good reasons to take cognitive biases seriously, as I’ve discussed recently. I’ve recommended using some of the criteria from these assessment frameworks myself. For example, I think that being transparent and triangulating multiple perspectives are important warrants for justifying evaluative claims.
Many of these criteria we take for granted. For instance, it’s common to argue that the consistency of evidence is important. It’s generally good news when various studies agree on what works in different locations. I think there is a good argument that consistency matters. But have we really paused to consider, for instance, that diversity might actually be desirable for some people, under some circumstances? Little knowledge or evidence, if any, is universally valid.
There isn’t much of a serious debate about why certain criteria should be valued over others, or about whose ways of knowing and value systems ought to be considered as part of this process. The “what works” evidence revolution is largely silent on such questions. Yet as Enrique Mendizábal explains in this excellent thread, evidence doesn’t matter as much as we think it does:
We have to recognise that anchoring the discourse around particular biases and a constructed hierarchy of methods is exclusionary. This kind of evidence ‘is not always available or appropriate’, and it means we can’t answer some of the most important questions in the first place, as Tim Harford wrote recently on the perils of nudge theory. I’m not sure to what degree the mood is really shifting (nor that nudging is necessarily a catastrophic mistake). But this kind of enquiry raises important questions for the evidence industry’s anti-politics machine and how well it wrestles with the big questions of whose reality counts and what evidence should count as credible.
For those who are interested, Marina Apgar and I will be running a training session for the UK Evaluation Society on how to assess the quality of evidence in case-based approaches, reflecting our experience wrestling with some of these dilemmas in practice.