Playing the Game

Kings made tombs more splendid than houses of the living and counted old names in the rolls of their descent dearer than the names of sons. Childless lords sat in aged halls musing on heraldry.

— J. R. R. Tolkien, The Two Towers

If there’s anything that makes me lose the will to live, it is being in the same room as an educational Player of Games. I’m sure everyone is reasonably familiar with the type: “I want every intervention from now until the end of term focused on improving the A*-C pass rate for left-handed Y11 students whose birthday month has an R in it.”

Yes, it might help, marginally, in some sense. On such massaging of the margins are modern educational careers and reputations built.

Personally, such considerations leave me cold. Such teachers, it seems to me, hold their statistics in higher esteem than their students. The percentage is adjudged to be the outcome, rather than merely an indicator of a number of successful outcomes.

Sometimes, when I try to express this, people look at me as if I had twelve heads. It is a subtle difference of emphasis, admittedly, but I think it’s a valid one. As an analogy, imagine a doctor who focuses on (say) a patient’s temperature to the exclusion of all else:

“Doctor, I think I’ve broken my leg.”

“H’mm, let’s have a look. Actually, your temperature is a wee bit high. Here, let me apply this cold compress to your forehead.”

“But what about my leg?”

“Well, your body temperature is back to normal now. That means that we now have the officially mandated number of ‘healthy’ patients as per Ofdoc guidelines.”

“But what about my bloody BROKEN LEG?”

“My work here is done. Next patient please!”

The other note of caution that needs to be sounded more loudly in the education world is awareness of what is known as the Halo Effect.

I learned about this in Duncan Watts’ excellent book Everything Is Obvious (When You Know The Right Answer), in which he summarises the work of Phil Rosenzweig:

Firms that are successful are consistently rated as having visionary strategies, strong leadership, and sound execution, while firms that are performing badly are described as suffering from misguided strategy, poor leadership or shoddy execution. But, as Rosenzweig shows, firms that exhibit large swings in performance over time attract equally divergent ratings, even when they have pursued exactly the same strategy, executed in the same way, under the same leadership all along. Remember that Cisco Systems went from being the poster child of the Internet era to a cautionary tale in a matter of a few years . . . Rosenzweig’s conclusion is that in all these cases, the way firms are rated has more to do with whether they are perceived as succeeding than what they are actually doing.

— Watts, p.197 [emphasis added]

In one early experiment, several teams were asked to analyse the finances of a fictitious firm. Each team was given a performance score and then asked to rate itself on teamwork, communication and motivation. The high-scoring teams rated themselves much more highly on these measures than the low-scoring teams, as you might expect. However, the kicker was that the performance scores had been allocated at random: there was no real difference between the teams’ performance at all. The conclusion is that the appearance of superior outcomes produced an illusion of superior functionality.

Watts argues persuasively that we tend to massively underestimate the role of plain, dumb luck in achieving success. He cites the case of Bill Miller, the legendary mutual fund manager who did something no other mutual fund manager has ever achieved: he beat the S&P 500 for fifteen straight years. Watts notes that this seems a classic case of talent trumping luck. However:

. . . right after his record streak ended, Miller’s performance was bad enough to reverse a large chunk of his previous gains, dragging his ten-year average below that of the S&P. So was he a brilliant investor who simply had some bad luck, or was he instead the opposite: a relatively ordinary investor whose ultimately flawed strategy just happened to work for a long time? The problem is that judging from his investing record alone, it’s probably not possible to say. [p.201]
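To get a feel for how far plain luck can stretch, here is a minimal sketch. It is not a calculation taken from Watts’ book. It assumes, purely for illustration, that every fund manager has an independent 50% chance of beating the index in any given year; that figure, the pool of 10,000 managers and the function names are all hypothetical.

```python
import random


def streak_probability(years, p_beat=0.5):
    """Chance that one manager beats the index in every one of `years` years,
    assuming each year is independent with probability `p_beat` (an assumption)."""
    return p_beat ** years


def chance_of_at_least_one_streak(n_managers, years, p_beat=0.5):
    """Chance that at least one of `n_managers` independent managers
    achieves the full streak by luck alone."""
    return 1 - (1 - streak_probability(years, p_beat)) ** n_managers


def simulate_streaks(n_managers, years, p_beat=0.5, seed=0):
    """Count the simulated managers who beat the index every single year."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same count
    return sum(
        all(rng.random() < p_beat for _ in range(years))
        for _ in range(n_managers)
    )


if __name__ == "__main__":
    # Hypothetical pool of 10,000 managers over a fifteen-year window.
    print(streak_probability(15))
    print(chance_of_at_least_one_streak(10_000, 15))
    print(simulate_streaks(10_000, 15))
```

Under these assumptions a single manager has a 1 in 32,768 chance of a fifteen-year streak, yet across 10,000 such managers the chance that at least one of them manages it is roughly one in four. A streak on its own therefore says very little about whether skill was involved.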

I trust that I do not have to draw too many lines to highlight the relevance of these points to the education world. Outcomes, in the sense of exam grades, are currently the be-all and the end-all of education. But the Halo Effect makes it clear that a simplistic reading of successful outcomes can be highly misleading.

Negating the Halo Effect is difficult, because if one cannot rely on the outcome to evaluate a process then it is no longer clear what to use. The problem, in fact, is not that there is anything wrong with evaluating processes in terms of outcomes — just that it is unreliable to evaluate them in terms of any single outcome. [p.198]

Ofsted, managers and politicians please take note: our search for a signal continues.

Approval of what is approved of
Is as false as a well-kept vow.

— Sir John Betjeman, The Arrest of Oscar Wilde At The Cadogan Hotel

Filed under Data, Education, Politics, Society

4 responses to “Playing the Game”

  1. Luck also intervenes in exams. Students who revise consistently are probably going to do better, on average, in tests. But you are always going to get good students who have an off day, or bad ones who get the single question they know something about. Winston Churchill wrote about how awful he was at school and how, the night before an important test, he picked one country at random, New Zealand, out of the 12 he might be tested on. The next day the question was on, yes, NZ and he got top marks. Jammy bugger.

  2. chrismwparsons

    “The problem, in fact, is not that there is anything wrong with evaluating processes in terms of outcomes — just that it is unreliable to evaluate them in terms of any single outcome. [p.198]”

    This I think is the fascinating thing to ponder. As well as the false attribution issue (“I took Vitamin C tablets for 2 weeks solid and my twisted ankle got better”), there’s also the side-effect issue of any technique/strategy/intervention. “Every time my skin’s a bit itchy I slap on the hydrocortisone!”

  3. Pingback: PIXL: panacea or poison? | e=mc2andallthat
