Evaluation Clock
The Evaluation Clock (adapted from Pietro 1983) unpacks the evaluation or systematic study component of the Cycles and Epicycles of Action Research framework. The Clock indirectly addresses the planning component of the framework by making you look ahead to consider which people might be influenced by the results and what they could do based on the possible outcomes.
The ultimate goal is that you learn to use the Clock to design your own evaluation or systematic study mindfully, working not only
- sequentially—addressing the whole range of considerations (moving from steps 0 to 11)—but also
- recursively—adjusting your plans for the earlier steps in light of thinking ahead about possibilities for the later steps.
In particular, evaluation and planning (or design) should be inextricably linked. For example, when you think about what could be done differently (step 11, below) on the basis of the specific measurements or observations you include in the evaluation (step 3), you may refine your measurements or observations. You may even decide to separate out two or more different sub-issues within the overall issue (steps 0-2), each requiring a different evaluation. As Pietro (1983, 23) says: “The clock marks time in an unusual fashion, since it does not necessarily move in a clockwise direction, but rather jumps from one number to another until all the questions have been struck.” It has been suggested that using the Clock looks more like undoing a safe's combination lock. Working sequentially and recursively is characteristic of Action Research as a whole, except that with the Evaluation Clock each step might require a tight, self-conscious method (see, e.g., Statistical Thinking for step 6).
Comparisons
Did the intervention have the intended effects? Was it better than other approaches? Answering such questions is a matter of evaluation, the systematic study of the effects of some intervention or engagement. There is always a comparison involved. The comparison might be before versus after some intervention is made, or it might be a comparison of one situation (where a particular curriculum, treatment, etc. is used) versus another situation (lacking that curriculum, etc.) (steps 2 and 3 of the Clock). (The idea of comparison can also be applied to continuous data, e.g., on the incidence of violent crimes in relation to the unemployment rate. This is, more or less, equivalent to asking whether there is more (or less) violent crime in times of high unemployment than in times of low unemployment.)
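To make the comparison idea concrete, here is a minimal sketch in Python (with invented illustrative numbers, not real data) of the continuous-data case just mentioned: monthly figures are split into high- and low-unemployment months, and the average incidence of violent crime is compared across the two groups.

```python
# Minimal sketch of a comparison on continuous data (invented numbers).
# Each record is (unemployment_rate_percent, violent_crimes_per_100k) for one month.
from statistics import mean, median

months = [
    (4.1, 38), (4.5, 41), (5.0, 40), (5.2, 44), (5.8, 43), (6.1, 47),
    (6.5, 46), (7.0, 51), (7.4, 49), (8.0, 55), (8.3, 54), (9.1, 58),
]

# Turn the continuous variable into a two-situation comparison:
# months above the median unemployment rate versus months at or below it.
cutoff = median(rate for rate, _ in months)
high = [crime for rate, crime in months if rate > cutoff]
low = [crime for rate, crime in months if rate <= cutoff]

print(f"High-unemployment months: mean crime rate = {mean(high):.1f}")
print(f"Low-unemployment months:  mean crime rate = {mean(low):.1f}")
print(f"Difference (high - low):  {mean(high) - mean(low):.1f}")
```

The same structure, two (or more) situations compared with respect to a specific observable, is what steps 2 and 3 of the Clock ask you to spell out.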
In valid comparisons all other factors are supposed to be equal or unchanged. If they are not, then the comparison is suspect. Perhaps it needs to be broken into a number of comparisons, e.g., before versus after for privileged schools, and before versus after for poor schools.
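A minimal sketch of such a breakdown (again with invented numbers and a hypothetical grouping into privileged and poor schools): the before/after change is examined within each type of school, so that differences between the school types do not distort the overall result.

```python
# Sketch of breaking a suspect comparison into separate comparisons
# (invented numbers): before/after is compared within each group of schools.
from statistics import mean

scores = {
    "privileged schools": {"before": [78, 82, 80, 85], "after": [84, 88, 83, 90]},
    "poor schools":       {"before": [55, 60, 58, 52], "after": [57, 63, 61, 56]},
}

for group, data in scores.items():
    change = mean(data["after"]) - mean(data["before"])
    print(f"{group}: before {mean(data['before']):.1f}, "
          f"after {mean(data['after']):.1f}, change {change:+.1f}")
```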
Evaluation may also be a matter of systematically studying what has already been happening. In that case, it may only involve collecting information about one situation, e.g., finding what percentage of adults are able to read competently. The formulation of the evaluation criteria and interpretation of the results depends, however, on an implicit comparison with a desired situation, e.g., one in which there is full adult literacy.
Learning to use the Evaluation Clock
In order to get acquainted with the comparison at the heart of the Clock and the sequential and recursive aspects of using it, it is helpful to reconstruct an evaluation that has already been conducted. When you do this you have to imagine being one of the people who did the research and fill in the steps they appear to have taken. In order to get the hang of comparisons, start by focusing on steps 2 and 3 for a simple case (e.g., Goode 1998 on the effects of a smoking ban in bars). Steps 0, 4 and 5 may help you as well. (These steps make up the stripped-down Clock appended below the full Clock.) When you have the hang of the comparison idea, then pay attention to the sequential and recursive aspects of the Clock.
The sequential part of reconstructing an evaluation means that the answers at each step are logically related to the previous ones, especially the immediately preceding one. For example, the lessons in step 10 are lessons learned from the reasons (step 9) for what is happening (step 8a). Similarly, the outlets (step 8b) should take into account the sponsors' goals and audience (step 1). Sequentiality also means that the key issues of the evaluation (step 2) cannot be the issues that emerge after the results (steps 8-11). The key issues must be what the evaluators saw as needing study before they knew the actual results.
The recursive part of reconstructing an evaluation means that when you think about what the evaluators or their sponsors did with the results (steps 10 and 11)—or what they could conceivably do with the results—you might go back and revise your interpretation of what decisions or policies or actions were at stake (steps 0 and 1). For example, an evaluation that points out that a low percentage of New York City high school students were passing the Regents exam says little about the causes of the low percentage or about ways to improve education in the school system. We might even suspect that what concerned the sponsors of the evaluation (step 0) was to discredit public education. This conjecture would have to be validated, of course. In the meantime, however, we can note that someone wanting to learn how to improve public education would want to design a quite different evaluation.
When you try to make sense of evaluations that others have done or are proposing, you may see that parents, teachers, administrators, and policy makers want different things evaluated, even if the different wishes have been lumped together. For example, regarding high-stakes standardized tests, evaluations of the following different things are supposed to come from the one test: students' knowledge; new curricular frameworks as a means to improve students' knowledge; performance of teachers; performance of schools; and performance of school districts. In contrast to what happens with high-stakes tests, you should separate the different kinds of evaluation for any issue you are interested in, and address each evaluation appropriately. More generally, you should add notes from your own critical thinking about what others have done: Why evaluate in this situation? Why this evaluation and not another? What theories are hidden behind the intervention that was implemented? What supports are given to people to implement the intervention?
A note on working from newspaper articles: Often a newspaper article will not give you information for every step in the Clock. For the missing steps, fill in what you would do in the shoes of someone in the corresponding position, i.e., designing an evaluation (for the early steps), interpreting it (for the middle steps), or deciding on proposals to make (for the later steps). As in Action Research, deciding what you would do is a matter of making proposals that follow from research results and presenting the proposals to potential constituencies who might take them up if the research supports them.
Full Clock
0a. The intervention whose effect or effectiveness needs to be evaluated is...
- “Intervention” here is an umbrella term for an action, a change in a program, policy, curriculum, practice, or treatment, a difference between two situations, etc.
0b. Interest or concern in the effect/iveness of the intervention arises because...
1a. The group or person(s) that sponsor the evaluation of the intervention are...
1b. The people they seek to influence with the results are...
1c. The actions or decisions or policies those people might improve or affirm concern...
2. General Question: To evaluate the effect/iveness of the intervention, a comparison is needed between two (or more) situations, namely
a. between... and...
b. with respect to differences in the general area of its effect on…
3. Specific observables: To undertake that comparison, the effects of the intervention will be assessed by looking at the following specific variable(s) in the two (or more) situations...
4. The methods to be used to produce observations or measurements of those variables are...(survey, questionnaire, etc.)
5a. The people who will be observed or measured are...
5b. The observing or measuring is done in the following places or situations... or derived indirectly from the following sources...
6. The observations or measurements will be analyzed in the following manner to determine whether the two situations are significantly different... (one simple possibility is sketched just after this full Clock)
7a. Given that the people who will interpret (give meaning to) the analysis are...
7b. the analysis will be summarized/conveyed in the following form...
When the results are available, the following steps can be pinned down. In the design stage, you should lay out different possibilities.
8a. The results show that what has been happening is...
8b. This will be reported through the following outlets...
9. What has been happening is happening because...
10. The lessons learned by sponsors of evaluation are that...
11. What the sponsors should now do differently is...
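By way of illustration for step 6, the following sketch (invented numbers; one possibility among many, not part of Pietro's original Clock) uses a simple permutation test to ask whether the difference between two situations is larger than chance shuffling of the data would produce.

```python
# Illustrative sketch for step 6: is the difference between two situations
# larger than chance alone would produce? (Invented numbers, not real data.)
# A permutation test: shuffle the group labels many times and see how often
# a difference at least as large as the observed one appears by chance.
import random
from statistics import mean

before = [62, 58, 65, 60, 57, 63, 59, 61]   # e.g., scores before the intervention
after = [66, 64, 70, 63, 68, 71, 65, 69]    # e.g., scores after the intervention

observed = mean(after) - mean(before)

pooled = before + after
n_after = len(after)
trials = 10_000
count = 0
rng = random.Random(0)
for _ in range(trials):
    rng.shuffle(pooled)
    shuffled_diff = mean(pooled[:n_after]) - mean(pooled[n_after:])
    if abs(shuffled_diff) >= abs(observed):
        count += 1

print(f"Observed difference (after - before): {observed:.2f}")
print(f"Approximate p-value: {count / trials:.3f}")
```

In practice the appropriate analysis depends on the design of the comparison and the kind of data collected (see Statistical Thinking).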
Stripping Down the Clock
(to focus on the comparison involved in evaluating the effects of any intervention)
0. The intervention whose effect or effectiveness needs to be evaluated is...
- “Intervention” here is an umbrella term for an action, a change in a program, policy, curriculum, practice, or treatment, a difference between two situations, etc.
2. To evaluate the effect/iveness of the intervention a comparison is needed between two (or more) situations, namely a. between... and...
3. To undertake that comparison, the effects of the intervention will be assessed by looking at the following specific variable(s) in the two situations...
4. The methods to be used to produce observations or measurements of those variables are...(survey, questionnaire, etc.)
5. The people who will be observed or measured are...
This is done in the following places or situations... or derived indirectly from the following sources...
Goode, E. (1998). “When bars say no to smoking.” New York Times, 15 Dec.
Pietro, D. S. (ed.) (1983). Evaluation Sourcebook. New York: American Council of Voluntary Agencies for Foreign Service.