Category Archives: Assessment

Look at the pretty pictures…

Uniformity of practice seldom continues long without good reason.

So opined the estimable Dr Johnson in 1775. In other words, if a thing is done in a certain way, and continues to be done in that same way for a number of years by many different people, then it is a pretty safe bet that there is a good reason for doing the thing that way. And this is true even when that reason is not immediately apparent.

For the choice of this situation there must have been some general reason, which the change of manners has left in obscurity.

— Samuel Johnson, A Journey To The Western Islands of Scotland (1775).

Consider the following examples of “uniformity of practice”:



They are fairly bog-standard GCSE examination questions from the last two years from three different exam boards. But compare and contrast with an O-level Physics paper from 1966:




The “uniformity of practice” that leaps out at me is that the more modern papers, as a rule, have many more illustrations than the older paper. Partly, of course, this is to do with technology. It would have been (presumably) vastly more expensive to include illustrations in the 1966 paper.

Even if we assume that the difficulty level of the questions in the modern and older papers is equivalent (and therein lies a really complex argument which I’m not going to get into), there is a vast difference in the norms of presentation. For example, the modern papers seem to eschew large blocks of dense, descriptive text; this extends to presenting the contextual information in the ultrasound question as a labelled diagram.

Now I’m not saying that this is automatically a good or a bad thing, but there does seem to be a notable “uniformity of practice” in the modern papers.

Now what could be the “general reason” for this choice?

Rather than leave the “change of manners” responsible for the choice “in obscurity”, I will hazard a guess: the examiners know or suspect that many of their candidates will struggle with reading technical prose at GCSE level, and wish to provide visual cues in order for students to play “guess the context” games.

Now I’m not assigning blame or opprobrium to the examiners here. If I were asked to design an exam paper for a wide range of abilities I might very well come up with a similar format myself.

But does it matter? Are we testing Physics or reading comprehension here?

My point would be that there can be an elegance and beauty in even the most arid scientific prose. At its best, scientific prose communicates complex ideas simply, accurately and concisely. It may seem sparse and dry at first glance, but that is only because it is designed to be efficient — irrelevancies have been ruthlessly excised. Specialised technical terms are used liberally, of course, but this is only because they serve to simplify rather than complicate the means of expression. 

Sometimes, “everyday language” serves to make communication less direct by reason of vagueness, ambivalence or circumlocution. You might care to read (say) one of Ernest Rutherford’s papers to see what I mean by good scientific prose.

The O-level paper provides, I think, a “beginner’s guide” to the world of scientific, technical prose. Whereas a modern question on falling objects might tack on the sentence “You may ignore the effects of air resistance” as an afterthought or caveat, the O-level paper uses the more concise phrase “a body falling freely” which includes that very concept.

To sum up, my concern is that in seeking to make things easier, we have actually ended up making things harder, and robbing students of an opportunity to experience clear, concise scientific communication.


Filed under Assessment, Education, Physics

The Gamesters of Sparta

Sir. It must be considered, that a man who only does what every one of the society to which he belongs would do, is not a dishonest man. In the republick of Sparta, it was agreed, that stealing was not dishonourable, if not discovered.

— Samuel Johnson

At a recent event, the speaker asked us to consider a hypothetical conundrum: what if one GCSE Triple Science student was strong in (say) Chemistry and Biology, but significantly weaker in GCSE Physics? 

What course of action would you recommend? Extra support in Physics, was the consensus reply. 

Actually, said the speaker, the smart “Progress 8 Maximisation Strategy” would be to:

  1. Tell the student to focus her efforts entirely on Biology and Chemistry and completely ignore Physics. . .
  2. . . . but keep her entered for GCSE Physics anyway, and make sure that she goes into the exam hall and writes her name on the Physics papers, even if she does nothing else.

That way, she has ostensibly followed a full and balanced curriculum. She has, after all, been entered for all three Science subjects. And, since Progress 8 counts only the two highest Science grades (or so I’m told), the student’s contribution to the school’s league table position would also be secure.
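The arithmetic of the gambit is easy to sketch. Assuming, as the speaker claimed, that only the two highest of the three Science grades count, a toy calculation (hypothetical grades and a made-up helper function, not the real Progress 8 methodology) might look like this:

```python
def counted_science_grades(grades):
    """Return the two highest of three Science grades — assuming,
    as the speaker claimed, that only these feed into Progress 8."""
    return sorted(grades, reverse=True)[:2]

# Hypothetical student: strong in Biology (8) and Chemistry (7),
# weak in Physics (2). The Physics grade is simply discarded.
print(counted_science_grades([8, 7, 2]))
```

On those assumptions, whether she scores a 2 or a 0 in Physics makes no difference whatsoever to the school’s figures — which is precisely the speaker’s point.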

H’mm. Dishonest? No. In the school’s best interests? Definitely. In the student’s best interests? Erm . . . on balance, no.

Sadly, as the character Joseph Sisko (ably played by Brock Peters) once observed on Star Trek: Deep Space Nine: “There isn’t a test that’s been created that a smart man can’t find his way around!” And that includes Progress 8 . . .

Sir, I do not call a gamester a dishonest man; but I call him an unsocial man, an unprofitable man. Gaming is a mode of transferring property without producing any intermediate good. 

— Samuel Johnson


Filed under Assessment, Education, Society

They Wouldn’t Let It Lie: The Twelve Physics Pracs of Gove (Part 3)


One time a whole lot of the animals made up their minds they would go shares in building a house. There was old Brer Bear and Brer Fox and Brer Wolf and Brer Raccoon and Brer Possum — everyone right down to old Brer Mink. There was a whole crowd of them, and they set to work in building a house in less than no time.

Brer Rabbit was there too, of course, but he said it made his head swim to climb up the scaffold and build like the others, and he said too that he always got sun-stroke if he worked in the sun — but he got a measure, and he stuck a pencil behind his ear, and he went round measuring and marking, measuring and marking. He always looked so busy that all the other creatures said to each other that Brer Rabbit was doing a mighty lot of work.

And folk going along the road said that Brer Rabbit was doing more work than anyone. But really Brer Rabbit wasn’t doing anything much, and he might just as well have been lying by himself in the shade, fast asleep!

Brer Rabbit Gets A House by Enid Blyton

Of all the things that could be dumped overboard during a radical curriculum overhaul, the dreadful, unholy mess known variously as “controlled assessment” or “coursework” or “practical assessment” (but whose names are actually legion) would certainly get my vote.

So, I was actually faintly encouraged by the reformed A-levels’ insistence that students have to DO twelve “required” practicals, and that all that schools have to provide is evidence that their students have DONE those practicals to allow a “practical endorsement” to be ticked on exam certificates. Assessment of students’ practical skills would be in the final examinations.

In my naivety, I thought that a set of properly written laboratory notebooks would be sufficient evidence for the practical endorsement to be awarded. I actually enjoyed explaining the protocols for keeping a lab book to our AS Physics group; for example, the idea that it should be a contemporaneous working document, replete with mistakes and crossings out — proper science in the raw, so to speak, warts and all. And not a suspiciously pristine, antiseptic and bowdlerised “neat” copy. And I thought our students responded gamely to the challenge, even down to worrying whether the pen with erasable ink counted as an “indelible pen” or not.

But, goddammit, the latest email from our exam board shows that the JCQ wouldn’t let it lie, they wouldn’t let it lie.

Now, we have to minutely “track” (dread word!) our students’ practical skills, verily even unto recording onto the Holy Spreadsheet if we have indubitable observational evidence of each student reading an Instruction sheet or not.

Oh deary deary me. It calls to mind Wilfred Owen’s memorable lines about being fit to “bear Field-Marshal God’s inspection”.

But it won’t be Field-Marshal God inspecting us. Instead, it will be a vast floppy-eared army of snaggletoothed practical-assessor-Brer Rabbits, hopping all over the land, measuring and marking, marking and measuring…

I give up: this seems to me like defeat, a return to the discredited and unlamented paradigm of controlled assessment. This is defeat, a totally avoidable defeat that has been snatched from the ravening jaws of victory…

The Twelve Physics Pracs of Gove Part 1
The Twelve Physics Pracs of Gove Part 2
Bring Back POAE!


Filed under Assessment, Science

Educational Defeat Devices


The Volkswagen Emissions Test Defeat Device needs no introduction:

Full details of how [the defeat device] worked are sketchy, although the EPA has said that the engines had computer software that could sense test scenarios by monitoring speed, engine operation, air pressure and even the position of the steering wheel.
When the cars were operating under controlled laboratory conditions – which typically involve putting them on a stationary test rig – the device appears to have put the vehicle into a sort of safety mode in which the engine ran below normal power and performance. Once on the road, the engines switched out of this test mode.
The result? The engines emitted nitrogen oxide pollutants up to 40 times above what is allowed in the US.
BBC News 4/11/15

This perceptive post from cavmaths shows, I think, the danger of relying on widely used educational “best practice” short cuts. They can actually be deleterious to student understanding. In short, many of them are simply “educational defeat devices”, clever tricks designed to give a false impression of student performance under artificial test conditions, cheats that fall apart when tested in the real world.

1 Comment

Filed under Assessment, PIXL, Society

PIXL: panacea or poison?

“How the understanding is best conducted to the knowledge of science, by what steps it is to be led forwards in its pursuit, how it is to be cured of its defects, and habituated to new studies, has been the inquiry of many acute and learned men, whose observations I shall not either adopt or censure”.
–Samuel Johnson, The Rambler, April 1750

A colleague described a recent visit to a highly successful science department that has drunk mighty deep of the PIXL well. I shall summarise some of her observations and comments below. My reactions varied from intrigued to puzzled to horrified, but in keeping with the Johnson quote above, I shall endeavour to urge neither adoption nor censure — at least until I have thought about it some more.

Item the first: textbooks are forbidden. Students are taught from in-house PowerPoints and worksheets which are made available online for individual study by students. My colleague reported that she visited several classes in the same year group, and all the teachers were teaching the same topic with the same PowerPoint — and were often on exactly the same slide at the same time! Reportedly, this system was set up because science leaders were not satisfied with the quality of lessons being planned by individual teachers. For myself, I couldn’t help but be reminded of the Gaullist education minister who claimed to know which page of which textbook children throughout France would be studying on that very day . . .

Item the second: science leaders have exhaustively analysed the GCSE exam board specification to produce the materials mentioned above. Every learning point is translated into “student friendly” language and covered in detail. My information is that a typical starter activity might be for students to copy down a summary of important information from a PowerPoint, before practising application using worksheets and past paper questions. These are often peer marked. Since planning and resource making have been centralised, the workload of the classroom teacher appeared to be more manageable than in many schools.

Item the third: students are regularly tested. Test papers are gone over with a fine tooth comb by the science team and areas of weakness identified. These are addressed in large, multiclass study skills sessions led by the head of science in the assembly hall, teaching from the front (brave woman!) using an old fashioned OHP and transparencies! (Sigh! Now that takes me back: I can almost smell the banda machine solvent as we speak.) Students are sat at exam desks for the session, and the hall is supervised by teaching staff and SLT (including the headteacher on the day my colleague visited). This is followed by a “walk and talk” mock (i.e. the answer is modelled by the Head of Science on her trusty OHP), followed by individual exam practice under exam conditions.

And so we come to the question: shall we adopt or censure these observations?

The truth is: I am not sure.

On the one hand, I can see how this might be a rapid and effective way to improve results, especially in a school with an inexperienced science team. And the part of me that actually likes writing schemes of work and resources would relish the challenge of developing such a scheme. And I’m told that percentage science pass rates improved significantly from the low teens to the high eighties . . . over the course of a single year! And you can’t really argue with such success, can you? (Actually, yes you can — see this post on the Halo Effect.) Also, as Lt. Worf of the starship Enterprise once observed: “If winning isn’t important, why keep score?”

And yet . . .

Part of me rebels at such regimentation. Is this an example of the “McDonaldisation” of education, the continuing process of deskilling the classroom practitioner? I genuinely hate to say this, but given this model maybe Sir Ken Robinson has a point; although this particular iteration seems to owe more to Taylorism than to the nineteenth-century workhouse.

Use another teacher’s PowerPoint? Ugh! I’d rather be forced to use his toothbrush . . .

And, while I grant that many examination questions are indeed fit for purpose and thoughtfully designed to expose misunderstandings and misconceptions, I cannot help the feeling that our examination system has become an overly-powerful tail wagging an emaciated dog.

Is learning truly synonymous with exam success? Have we become so enamoured of the assessment of learning rather than learning itself that we, like the Scarecrow in The Wizard of Oz, would not consider ourselves truly learnèd unless we hold a diploma saying that we are?


Why, anybody can have a brain. That's a very mediocre commodity . . . great thinkers . . . think deep thoughts and with no more brains than you have. But they have one thing you haven't got: a diploma.

I shall leave the final word to my friend Sam Johnson:

“The great differences that disturb the peace of mankind are not about ends, but means. We have all the same general desires, but how those desires shall be accomplished will for ever be disputed.”
The Idler, December 1758


Filed under Assessment, Education, PIXL, Science, Society

Never Mind The Data, Feel The Noise (or, seek the signal, young Jedi)

Everyone in education loves data.


This is the only time it is correct to use the word "Data" in the singular...

Or at least claims to. One sometimes wonders what would happen to the UK education system if a computer virus disabled every Excel spreadsheet overnight — h’mmm, perhaps someone should get in touch with those nice hacker people at Anonymous . . .

However, I digress. I wanted to share a recent epiphany that I’d had about data, particularly educational data. Perhaps it’s not much of an epiphany, but I’ve started so I’ll finish.

It came when I was listening to an interview on the evergreen The Jodcast (a podcast produced by the Jodrell Bank Radio Observatory). Dr Alan Duffy was talking about some of the new technologies that need to be invented in order to run the new massive Square Kilometre Array radio telescope (due to begin observing in 2018):

And then we have to deal with some of the data rates . . . essentially we recreate all of the information that exists on the internet today, and we do that every year without fail, it just keeps pouring off the instrument. And what you’re looking for is the proverbial needle in the haystack . . . how do you pick out the signal that you’re interested in from that amount of data?
The Jodcast, October 2014, 18:00 – 21:00 min approximately [emphasis added]

The realisation that hit me was: it isn’t the data that should be centre stage — it’s the signal that’s contained within that data. And that signal can be as hard to find as the proverbial needle in a haystack, even without data volumes that are multiples of the 2014 Internet.

A simple example from the history of science: Edwin Hubble’s famous graph from 1929 that was one of the first pieces of evidence that we exist in an expanding universe. The data are the difficult and painstaking measurements made by Hubble and his colleague Vesto Slipher that are plotted as small circles on the graph.


The signal is the line of best fit that makes sense of the data by suggesting a possible relationship between the variables. Now, as you can see, not all the points lie on, or even close to, the line of best fit. This is because of noise — random fluctuations that affect any measurement process. Because Hubble and Slipher were pushing the envelope of available technology at the time, their measurements were unavoidably ‘noisy’, but they were still able to extract a signal, and that signal has been both confirmed and honed over the years.
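The point can be illustrated with a toy simulation (a sketch only — invented numbers, nothing to do with Hubble’s actual measurements): bury a known linear “signal” under random noise, and watch an ordinary least-squares fit recover something close to the true slope anyway.

```python
import random

# Toy illustration: a true linear signal y = 2x, buried in noise.
random.seed(42)
xs = [x * 0.5 for x in range(40)]
ys = [2.0 * x + random.gauss(0, 3.0) for x in xs]  # noisy "measurements"

# Ordinary least-squares fit — the "line of best fit".
n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x

print(f"recovered slope {slope:.2f} (true slope 2.0)")
```

No individual point lies on the fitted line, yet the fitted slope lands close to the true value of 2.0 — the signal survives the noise. Scale the same idea down to a single child’s test scores, where you have five data points rather than five thousand, and you see why the noise can so easily swamp the signal.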

In my experience, when the dread phrase “let’s look at the data” is uttered in education, the “search for a signal” barely extends beyond simplistic numerical comparisons: increase=doubleplus good, decrease=doubledoubleplus ungood.

The way we currently use data in schools reminds me of SF author William Gibson’s coining of the term cyberspace (way back in the pre-internet 1980s) as the

consensual hallucination experienced daily by billions of legitimate operators . . . a graphic representation of data abstracted from the banks of every computer in the human system
— William Gibson, Neuromancer (1984)

In my opinion, almost the whole statistical shebang associated with UK education, from the precipitous data-mountains of the likes of RAISEOnline (TM) to the humblest tracking spreadsheet for a department of one, is actually nothing more than a ‘consensual hallucination’.

The numbers, levels and grades mean something because we say they mean something. And sometimes, it is true, they can tell a story.

Let’s say a student has variable test scores in one subject over a few months: does this tell us something about the child’s actual learning, or about possible inconsistencies in the department’s assessment regime, or about the child’s teachers?

My point is that WE DON’T KNOW without cross referencing other sources of information and using — wait for it — professional judgement.

I believe that the search for a signal should be central to any examination of data, and that this is best done with a human brain through the lens of professional experience. And, given the inevitability of noise and uncertainty in any measurement process, with a generous number of grains of statistical salt.


Filed under Assessment, Education, Levels, Physics, Science, Society

Weasel Words in Education Part 5: Rigour

A crack team of DfE boffins test the proposed new system for the management and oversight of the United Kingdom’s increasingly fissiparous school system.

Rigour, n.

1. The quality of being extremely thorough and careful.

2. Severity or strictness.

3. (when pluralised) Harsh and demanding conditions.

In education (as in other walks of life) the word rigour is usually meant in sense (1) when applied to one’s own thinking or the thinking of one’s friends or allies: “I am being rigorous. However, you, sir, are merely pedantic.”

These days, sense (2) seems to require the insertion of a prefix, as in “The moderation of our controlled assessments was over-rigorous.”

Rigour is therefore a good thing, right?

However, in my opinion it seems to be used more and more as a talisman rather than as a genuine description.

Mr Gove told the Commons: “The new specifications are more challenging, more ambitious and more rigorous. That means more extended writing in subjects like English and history, more testing of advanced problem-solving skills in mathematics and science.”

The Independent, July 2013

I am not sure if Michael Gove* is using the word in sense (1) or sense (2) here. If he meant it in sense (2) then it is a rhetorical flourish to emphasise the idea that GCSEs will be more challenging. If he meant it in sense (1) then the promise of “extended writing [and] more testing” doesn’t tell me how the new exams will be more thorough and careful. This is not to say that the examination system does not need to be more thorough and careful, merely that “extended writing [and] more testing” won’t necessarily make it so.

Let me emphasise that I am not opposed to rigour. I like rigour and being rigorous, at least in sense (1). I would perhaps favour the words consistent and fair rather than use rigour in sense (2) in an educational context, but that’s a personal preference.

In short, I wish people would be more rigorous in their use of the word rigorous. You shouldn’t just use it because you think it sounds good. “A is rigorous while B is not” should mean more than “I like A and dislike B”.

And as a final thought, I strongly suspect that many of the people who are most keen to bemoan the lack of rigour in education would have to step out of the kitchen when push came to shove, as in this little vignette:

[I listened] to magazine columnist Fred Barnes . . . whine on and on about the sorry state of American education, blaming the teachers and their evil union for why students are doing so poorly. “These kids don’t even know what The Iliad and The Odyssey are!” he bellowed, as the other panellists nodded in admiration at Fred’s noble lament.

The next morning I called Fred Barnes at his Washington office. “Fred,” I said, “tell me what The Iliad and The Odyssey are.”

He started hemming and hawing. “Well, they’re … uh … you know … uh … okay, fine, you got me—I don’t know what they’re about. Happy now?”

No, not really. You’re one of the top TV pundits in America, seen every week on your own show and plenty of others. You gladly hawk your “wisdom” to hundreds of thousands of unsuspecting citizens, gleefully scorning others for their ignorance.

— Michael Moore, Stupid White Men (2001), p.58


* His successor Nicky Morgan looks set to continue Gove’s use of the term.

Postscript: For the those (including myself) who are classically undereducated: The Iliad is an ancient Greek epic poem by Homer about the Trojan War. The Odyssey is another epic poem by Homer recounting the ten-year journey home from the Trojan War made by Odysseus, the king of Ithaca.

1 Comment

Filed under Assessment, Education, Humour, Politics, Society