Firelei Báez, Above Ground

Repositioning evaluation theory trees

Thomas Aston
10 min read · Aug 20, 2024


I was recently asked by Rosa María Flores Rodríguez for my thoughts on evaluation theory trees. For several years, Flores Rodríguez has been working on the formidable task of updating and refiguring the famous evaluation theory tree by Marvin C. Alkin and Tina Christie (2004). I look forward to seeing the result at the European Evaluation Society (EES) Conference 2024. I figured I’d share some of my thoughts here to contribute to the conversation.

The tree itself has been re-examined on several occasions by Christie and Alkin, Carden and Alkin, Alkin and Christie, and Alkin, Christie, and Stephen. Indeed, the Journal of MultiDisciplinary Evaluation recently put out a Special Issue on Visualizing Evaluation Theory in which Christie talks of a tree planted and growing. As an art aficionado, I was happy to see reference to several modern art history trees. Kat Haugh and Deborah Grodzicki sought to bring the tree to life at the American Evaluation Association Conference in 2015.

Haugh and Grodzicki, 2016

Haugh and Grodzicki conducted a study which found that the “use” branch stole the show: of 390 responses, 100 votes were cast for utilization-focused evaluation, followed by developmental evaluation (57) and participatory evaluation (51).

The evaluation theory tree has historically been predominantly a genealogy of white American fathers (see the May 12 Group). After much protestation through a 600-person-strong petition, Alkin and Christie’s third version of Evaluation Roots in 2023 somewhat begrudgingly included Culturally Responsive Evaluation (CRE) and Indigenous Evaluation, as well as Developmental Evaluation. While these are valuable additions, I was shocked to see how unashamedly North American the tree still is. This is what the latest version looks like:

Alkin and Christie, 2023 in Ong, 2024

Michael Quinn Patton is probably right that the evaluation theory tree should be an evaluation forest. And he’s surely right that these trees should be seen as interdependent. In many respects, we’re clearly talking about different trees with different roots. Indeed, the lofty notion of evaluation being a “transdiscipline” should mean that we’re not so tightly bound by artificial disciplinary boundaries within the university subject “evaluation.”

I should also say that who or what you choose to dignify as an influence is both a subjective judgement and a political statement: “Evaluation theory is who we are,” as William Shadish put it. We’re all influenced by different roots because we have different dispositions, preferences, and values. My aim here is not merely to outline the evaluation canon, but to explain what I value and note some of the fissures I see in the field.

Mathea Roorda and Amy Gullickson remind us that “evaluation is by definition judgemental and value based.” In this light, I think we can probably sum up evaluation as a systematic process through which we make warranted judgments about what we value.

I love the following diagram from Megan Brown and Angelique Dueñas. In my view, it perfectly illustrates the roots question and the folly of a methods-first approach. The best methods mix ultimately depends on what you value and where you come from.

Brown and Dueñas, 2020

So, I’d flip the tree on its side and start with what (and how) different people value.

Valuing

Thomas Schwandt and Emily Gates (2021: vii) define valuing as a ‘kind of practice that involves identifying, naming, considering, and holding or respecting something… as important, beneficial, right to do, good to be.’ Relatedly, they define evaluating as a ‘particular kind of empirical investigation … appraising, weighing up, assessing, calculating, gauging, rating and ranking.’ You can’t do evaluation without knowing first what you value. Indeed, if your values are poorly aligned with those of other evaluation stakeholders, then how you make judgements and the warrants you seek may be quite different. We don’t have to agree, but we need to be clear where we’re coming from. There are, of course, some evaluators who believe that their values have no bearing on the evaluative judgements they make. Yet, this view reflects a particular epistemic position and set of biases. In practice, you can’t escape some statement of valuing in evaluation. It’s in the name for a reason.

In my view, several contributions are crucial to valuing. Ernest House and Kenneth Howe’s Values in Evaluation and Social Research questioned the relationship between facts and values. But Schwandt and Gates’ Evaluating and Valuing in Social Research is perhaps the most comprehensive statement on the role of values in evaluation today, so it is well worth a read. Still, perspectives on what we value, and even on what is out there to know, are varied.

I think one of the great errors in how the tree is conceived is that it assumes common roots from a bunch of dudes in the 1970s, many of whom in retrospect (sotto voce) made quite mediocre contributions. Indeed, most people don’t come to evaluation through reading the scholarly canon, and there are all manner of valuing and knowledge systems that long precede any academic discussion of evaluation. For this reason, simply adding culturally responsive evaluation or indigenous evaluation as some recent evolution of democratic or transformative evaluation is, in my view, profoundly misleading (even slightly offensive).

Different people and cultures value different things, and this underpins the judgements we make and the warrants we seek for these judgements. In my view, Bagele Chilisa’s greatest contribution in her Indigenous Research Methodologies and Made in Africa Evaluation is actually on relational axiology (values), ontology (being), and epistemology (knowing). This is far deeper than a mechanical selection of methods. Indeed, we find similar reflections on axiological, epistemic, and ontological foundations in the Equitable Evaluation Framework™.

Some of the ontological differences are quite radical. For instance, Chilisa points to not only relations among people, but also to relations with the living and the non-living (spirits and ancestors) as part of the Ubuntu philosophy which she argues underpins Made in Africa Evaluation. It’s not necessarily my personal reality, but it’s undeniably a reality for many of the 5.8 billion religiously affiliated persons in the world.

Reasoning

Following the base of valuing, we need evaluative reasoning — ‘the theory and practice of making warranted judgements about the value of a policy or program,’ as Julian King explains. Michael Scriven is evaluation’s official reasoning godfather. Evaluative thinking has become the new reasoning, and there have been some interesting contributions from Tom Archibald on this recently. King also reminds us that combining multiple viewing angles can strengthen evaluative reasoning (i.e., making warranted judgements). These different viewpoints often stem from different value systems, perceptions of what is real, and valorisation of different ways of knowing.

Epistemology is also fundamental to reasoning. Epistemic positions tend to underpin the logic of reasoning. We can discuss the age-old debate between (neo)positivists, constructivists, and realists. Yvonna Lincoln and Egon Guba’s seminal works on paradigms in Naturalistic Inquiry and Fourth Generation Evaluation are fundamental. If you have a naturalistic rather than “rationalistic” method of inquiry, what you find may be quite different. Similarly, realist evaluation is as much an epistemic stance as it is a methodology. Both challenge the post-positivism that the likes of Donald Campbell espoused in the “experimenting society.” This epistemic dialectic is a key fissure in the sector. Indeed, Lincoln and Guba developed different (qualitative) evaluative criteria around trustworthiness in contrast to Donald Campbell’s (quantitative) validity system.

More recent efforts to decolonise evaluation or to make it more culturally responsive are obviously about epistemology, but they are also about positionality. According to Jara Dean-Coffey, positionality “describes an individual’s world view and the position they adopt about a research task and its social and political context.” Positionality statements are an emerging trend — the American Journal of Evaluation recently made a call for papers about positionality statements in evaluation. As Dean-Coffey argues, who you are as a witness or participant matters to the evaluative judgments you make. Our different positionalities also affect how we reason, what we consider to be valid knowledge and evidence, and whose perspectives we consider valuable, or not. Even without positionality statements, acknowledging our positions is an important aspect of claims to “truth,” whether personal or general.

Use

Michael Quinn Patton is inarguably a titan in the evaluation field, and Utilization-Focused Evaluation is doubtless a key part of the evaluation canon. Patton’s argument was that an evaluation should be judged on its usefulness to its intended users. It’s quite difficult to assess its value given how ubiquitous discussions of evaluation use are today; a Google Scholar search returns 17,100 results. The 100 votes cast for utilization-focused evaluation and 57 votes cast for developmental evaluation in Haugh and Grodzicki’s 2015 study also illustrate the widely perceived importance of Patton’s contributions to the field.

Utilization-focused evaluation is probably most valuable as a principle rather than as guidance. I’ve never seen anyone apply Patton’s 17 steps. In a looser sense, though, utilization-focused evaluation has been the centripetal focus of Patton’s own influential writing in Developmental Evaluation, Principles-Focused Evaluation, and Blue Marble Evaluation. I have mixed views on the value of each of these, but they are all undeniably highly influential in the field.

Participation

Participation is both a value position and a process. The value of participation is probably one of the greatest fissures in the field. There’s endless crossfire between those who argue that evaluation is objective, must be independent and eliminate bias (and reject positionality), and those who argue it’s values-based, should be participatory, and recognise that positionality is real and biases are inevitable even in supposedly “objective” and “independent” evaluation.

Some of the earlier work on participation is noticeably tentative. In 1976, Robert Stake’s Theoretical Statement of Responsive Evaluation sought to loosen evaluation from its rigid pre-defined constraints, and this evolved into a discussion of responsiveness to programme participants in his Standards-Based and Responsive Evaluation. The title of Jennifer Greene’s Stakeholder participation in evaluation design: Is it worth the effort? from 1989 clearly captures the tentativeness. Lincoln and Guba’s Fourth Generation Evaluation in 1989 was considered daring at the time, and it holds up extremely well today. While David Fetterman’s Empowerment Evaluation formally sits under the use strand, it’s clearly a better fit under participation because, as BetterEvaluation notes, it’s ultimately about ‘providing groups with the tools and knowledge they need to monitor and evaluate their own performance and accomplish their goals.’

Today, there is a far more confident assertion of the value of participation and representation. For example, a decade ago evaluators such as Wehipeihana et al. argued that:

‘It’s not just that representation from the target population ethnicity or culture is “nice” or “good to have” on an evaluation team; you are actually going to seriously compromise the evaluation’s validity and credibility without it.’

The interstices between participation and rigour are one of the most interesting areas of innovation in evaluation today. The Causal Pathways Initiative, the Inclusive Rigour Co-Lab, and many others are seriously challenging the notion that participation and rigour (or validity) are necessarily contradictory. This is nicely expressed in Apgar et al.’s rethinking of rigour to embrace complexity in peacebuilding evaluation. To respond to Greene’s question, a growing proportion of evaluation practitioners today would argue that the effort is often worth it.

Methodology

Downstream of values, reasoning, use, and participation considerations are more practical methodological considerations.

There have been some really helpful efforts to outline divergent method paths in recent years. Thomas Delahais has a great summary of the evolution of impact evaluation which explains the fissures between experimental, statistical, configurational and theory-based approaches to impact evaluation. Sebastian Lemire’s evaluation metro map captures some of this nicely too.

Lemire, 2020

So much of the energy and oxygen in the methods debate revolves around disagreement about the value of Randomised Control Trials (RCTs). I’ve written about this an unhealthy amount. However, we need to understand what lies beneath the fissure between the randomistas and their critics. The dialectic goes something like this:

Option A: Randomistas tend to value objectivity and independence (and believe these are real things); they tend to espouse a positivist epistemology, which implies that a single truth can be known; they usually embrace Campbell’s (quantitative) validity system; and they tend to believe that they can isolate specific effects and derive an unbiased estimate of the effect of an intervention.

Option B: Anti-randomistas tend to believe in the importance of positionality and participation; they typically espouse constructivist or realist epistemologies, which (especially the former) imply that there are multiple valid perspectives on the truth; they tend to embrace something akin to Lincoln and Guba’s criteria of trustworthiness; and they believe that change happens in complex systems, so they see the interaction between interventions and their context as key to causal explanation.

Evaluators who resonate with option A are more likely to be influenced by the work of Donald Campbell and his acolytes Tom Cook and William Shadish on experimental and quasi-experimental approaches to evaluation. As Delahais shows, this in turn laid the foundations for the Nobel prize winners Esther Duflo, Abhijit Banerjee, and Michael Kremer and their disciples, who favour the orange method line.

Evaluators who resonate with option B tend to be indirectly influenced by Carol Weiss, Huey Chen, and Peter Rossi on theory-based evaluation, alongside Robert Yin’s work on case studies and Lincoln and Guba’s Fourth Generation Evaluation, even if they haven’t read these works (few outside formal evaluation degrees have).

What kinds of methodological combinations evaluators seek relies upon many of the priors above. So, let’s be transparent about them. The different iterations of the evaluation tree are helpful reminders of both shared foundations and areas of divergence. Let’s hope we can get better at seeing the wood for the trees.

