Category Archives: Assessment

Assessment Will Eat Itself

Seemingly a lifetime ago I remember writing about the worst mark scheme ever written. John Tomsett recently wrote a searing blogpost about a more recent version.

Laura then took me to her classroom, where piles of coursework were strewn across every table, and showed me what she has to mark. She has 29 students’ work to assess, having to write comments to justify her marks in 7 boxes for each student. That is 203 separate comments with minimal, if any, support from OCR. Page after page of assessment descriptors without any exemplar materials to help Laura, and her colleagues across the country, make accurate interpretations of what on earth the descriptors mean.

This is an example — pure and simple — of assessmentitis.

“-itis” is the correct medical suffix since the assessment system is, indeed, inflamed. Distended. Bloated. Swollen. Engorged. Puffed up.

How did it come to this? When you meet people who work for the examination boards, they are — by and large — pleasant, normal, well-adjusted and well-intentioned people, at least as far as I can judge. How can they produce such prolix monstrosities?

Dr Samuel Johnson made the telling observation that “Uniformity of practice seldom continues long without good reason.” The fact that all the exam boards tend to produce similar styles of document indicates that they are responding to a system or set of pressures that dictate such a response.

I suspect that, at its heart, the system has at least one commendable aim: that of fairness, and that of ensuring that everyone is making similar judgements.

In answer to the age-old question: “But who is to guard the guards themselves?” they have attempted to set up an impenetrable Wall of Words.

But here’s the thing: words can be slippery little things, capable of being interpreted in many different ways. Hence the need to add a comment to give an indication of how one interpreted the marking criteria. It has been suggested that “expected practice” (“best practice” to some) is to include phrases from the marking criteria in the comment on how one applied the marking criteria . . .

This is already an ever-decreasing-death-spiral of self-referential self-referring: assessment is eating itself!

Soon we will be asked to make comments on the comments. And then comments on the comments that we made commenting on how we applied the marking criteria.

But here’s another thing: if the guards are so busy completing paperwork explaining how they are meeting the criteria of competent guarding and establishing an audit-trail of proof of guarding-competencies — then, at least some of the time, they’re not actually guarding, are they?

Who is to guard the guards themselves? In the end, one has to depend on the guards to guard themselves. Choose them well, trust them, and try to instil a professional pride in the act of guarding in them.

Pride and honest professionalism: they are the ultimate Watchmen.



Filed under Assessment, Education, POAE

Markopalypse Now

AHT VAL: And once you’ve finished marking your students’ books and they have responded IN DETAIL to your DETAILED comments, you must take them in again and mark them a second time using a different coloured pen!

AHT HARVEY: A page that’s marked in only one colour is a useless page!

NQT BENJAMIN: Erm, if you say so. But why?

AHT VAL: It’s basic Ofsted-readiness, Benjamin. Without a clearly colour-coded dialogue between teacher and student, how can we prove that the student has made progress as a result of teacher feedback?

NQT BENJAMIN: But I’ve only got this red biro…


AHT HARVEY: In this school we wage a constant battle against teacher sloth and indifference!

(With apologies to The League Of Gentlemen)

I have been a teacher for more than 26 years and I tell you this: I have never marked as much or as often as I am now. We are in the throes of a Marking Apocalypse — a Markopalypse, if you will.

And why am I doing this? Have I had a Damascene conversion to the joy of rigorous triple marking?

No. I do it because I have to. I do it because of my school’s marking policy. More to the point, I do it because my school expends a great deal of time and energy checking that its staff are following the policy. And my school is not unique in this.

Actually, to be fair, I think my current school has the most nearly-sensible policy of the three schools I have worked in most recently, but it is still an onerous burden even for an experienced teacher who can take a number of time-saving short cuts in terms of lesson planning and preparation.

Many schools now include so-called “deep marking” or “triple marking” in their lists of “non-negotiables”, but there are at least two things that I think all teachers should know about these policies.

1. “We have to do deep/triple marking because of Ofsted”

No, actually you don’t. In 2016, Sean Harford (Ofsted National Director, Education) wrote:

[I]nspectors should not report on marking practice, or make judgements on it, other than whether it follows the school’s assessment policy. Inspectors will also not seek to attribute the degree of progress that pupils have made to marking that they might consider to be either effective or ineffective. Finally, inspectors will not make recommendations for improvement that involve marking, other than when the school’s marking/assessment policy is not being followed by a substantial proportion of teachers; this will then be an issue for the leadership and management to resolve.

2. “Students benefit from regular feedback”

Why yes, of course they do. But “feedback” does not necessarily equate to marking.

Hattie and Timperley write:

[F]eedback is conceptualized as information provided by an agent (e.g., teacher, peer, book, parent, self, experience) regarding aspects of one’s performance or understanding. A teacher or parent can provide corrective information, a peer can provide an alternative strategy, a book can provide information to clarify ideas, a parent can provide encouragement, and a learner can look up the answer to evaluate the correctness of a response. Feedback thus is a “consequence” of performance.

So a textbook, mark scheme or model answer can provide feedback. It does not have to be a paragraph written by the teacher and individualised for each student.

Daisy Christodoulou makes what I think is a telling point about the “typical” feedback paragraphs encouraged by many school policies:

[T]eachers end up writing out whole paragraphs at the end of a pupils’ piece of work: ‘Well done: you’ve displayed an emerging knowledge of the past, but in order to improve, you need to develop your knowledge of the past.’ These kind of comments are not very useful as feedback because whilst they may be accurate, they are not helpful. How is a pupil supposed to respond to such feedback? As Dylan Wiliam says, feedback like this is like telling an unsuccessful comedian that they need to be funnier.


Filed under Assessment, Education, Humour, Uncategorized

Look at the pretty pictures…

Uniformity of practice seldom continues long without good reason.

So opined the estimable Dr Johnson in 1775. In other words, if a thing is done in a certain way, and continues to be done in that same way for a number of years by many different people, then it is a pretty safe bet that there is a good reason for doing the thing that way. And this is true even when that reason is not immediately apparent.

For the choice of this situation there must have been some general reason, which the change of manners has left in obscurity.

— Samuel Johnson, A Journey To The Western Islands of Scotland (1775).

Consider the following examples of “uniformity of practice”:

[Images: three GCSE examination questions from different exam boards]

They are fairly bog-standard GCSE examination questions from the last two years from three different exam boards. But compare and contrast with an O-level Physics paper from 1966:

[Image: an O-level Physics examination paper from 1966]

The “uniformity of practice” that leaps out at me is that the more modern papers, as a rule, have many more illustrations than the older paper. Partly, of course, this is to do with technology. It would have been (presumably) vastly more expensive to include illustrations in the 1966 paper.

Even if we assume that the difficulty level of the questions in the modern and older papers is equivalent (and therein lies a really complex argument which I’m not going to get into), there is a vast difference in the norms of presentation. For example, the modern papers seem to eschew large blocks of dense, descriptive text; this extends to presenting the contextual information in the ultrasound question as a labelled diagram.

Now I’m not saying that this is automatically a good or a bad thing, but there does seem to be a notable “uniformity of practice” in the modern papers.

Now what could the “general reason” for this choice be?

Rather than leave the “change of manners” responsible for the choice “in obscurity”, I will hazard a guess: the examiners know or suspect that many of their candidates will struggle with reading technical prose at GCSE level, and wish to provide visual cues so that students can play “guess the context” games.

Now I’m not assigning blame or opprobrium to the examiners here. If I were asked to design an exam paper for a wide range of abilities I might very well come up with a similar format myself.

But does it matter? Are we testing Physics or reading comprehension here?

My point would be that there can be an elegance and beauty in even the most arid scientific prose. At its best, scientific prose communicates complex ideas simply, accurately and concisely. It may seem sparse and dry at first glance, but that is only because it is designed to be efficient — irrelevancies have been ruthlessly excised. Specialised technical terms are used liberally, of course, but this is only because they serve to simplify rather than complicate the means of expression. 

Sometimes, “everyday language” serves to make communication less direct by reason of vagueness, ambivalence or circumlocution. You might care to read (say) one of Ernest Rutherford’s papers to see what I mean by good scientific prose.

The O-level paper provides, I think, a “beginner’s guide” to the world of scientific, technical prose. Whereas a modern question on falling objects might tack on the sentence “You may ignore the effects of air resistance” as an afterthought or caveat, the O-level paper uses the more concise phrase “a body falling freely” which includes that very concept.

To sum up, my concern is that in seeking to make things easier, we have actually ended up making things harder, and robbing students of an opportunity to experience clear, concise scientific communication.


Filed under Assessment, Education, Physics

The Gamesters of Sparta

Sir. It must be considered, that a man who only does what every one of the society to which he belongs would do, is not a dishonest man. In the republick of Sparta, it was agreed, that stealing was not dishonourable, if not discovered.

— Samuel Johnson

At a recent event, the speaker asked us to consider a hypothetical conundrum: what if one GCSE Triple Science student was strong in (say) Chemistry and Biology, but significantly weaker in GCSE Physics? 

What course of action would you recommend? Extra support in Physics, was the consensus reply. 

Actually, said the speaker, the smart “Progress 8 Maximisation Strategy” would be to:

  1. Tell the student to focus her efforts entirely on Biology and Chemistry and completely ignore Physics. . .
  2. . . . but keep her entered for GCSE Physics anyway, and make sure that she goes into the exam hall and writes her name on the Physics papers, even if she does nothing else.

That way, she has ostensibly followed a full and balanced curriculum. She has, after all, been entered for all three Science subjects. And, since Progress 8 counts only the two highest Science grades (or so I’m told), the student’s contribution to the school’s league table position would also be secure.
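The arithmetic behind the speaker’s strategy can be sketched in a few lines. This is a hypothetical illustration only: the grade values and the top-two-of-three counting rule are assumptions taken from the hedged description above (“or so I’m told”), not the official Progress 8 calculation.

```python
# Hypothetical sketch of the "Progress 8 Maximisation Strategy".
# Assumes only the two highest of the three separate science grades
# count towards the school's score; grade values (9-1) are illustrative.

def counted_science_points(biology, chemistry, physics):
    """Return the sum of the two highest of the three science grades."""
    return sum(sorted([biology, chemistry, physics], reverse=True)[:2])

# The student spreads her effort across all three subjects:
balanced = counted_science_points(biology=7, chemistry=7, physics=4)

# The student ignores Physics entirely (writes her name, nothing else),
# and the freed-up effort nudges Biology and Chemistry up a grade each:
gamed = counted_science_points(biology=8, chemistry=8, physics=0)

print(balanced, gamed)  # prints: 14 16
```

Because the weakest grade is discarded either way, the Physics score of zero costs the school nothing, while the improved Biology and Chemistry grades raise its counted total; only the student loses.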

H’mm. Dishonest? No. In the school’s best interests? Definitely. In the student’s best interests? Erm . . . on balance, no.

Sadly, as the character Joseph Sisko (ably played by Brock Peters) once observed on Star Trek: Deep Space Nine: “There isn’t a test that’s been created that a smart man can’t find his way around!” And that includes Progress 8 . . .

Sir, I do not call a gamester a dishonest man; but I call him an unsocial man, an unprofitable man. Gaming is a mode of transferring property without producing any intermediate good. 

— Samuel Johnson


Filed under Assessment, Education, Society

They Wouldn’t Let It Lie: The Twelve Physics Pracs of Gove (Part 3)


One time a whole lot of the animals made up their minds they would go shares in building a house. There was old Brer Bear and Brer Fox and Brer Wolf and Brer Raccoon and Brer Possum — everyone right down to old Brer Mink. There was a whole crowd of them, and they set to work in building a house in less than no time.

Brer Rabbit was there too, of course, but he said it made his head swim to climb up the scaffold and build like the others, and he said too that he always got sun-stroke if he worked in the sun — but he got a measure, and he stuck a pencil behind his ear, and he went round measuring and marking, measuring and marking. He always looked so busy that all the other creatures said to each other that Brer Rabbit was doing a mighty lot of work.

And folk going along the road said that Brer Rabbit was doing more work than anyone. But really Brer Rabbit wasn’t doing anything much, and he might just as well have been lying by himself in the shade, fast asleep!

Brer Rabbit Gets A House by Enid Blyton

Of all the things that could be dumped overboard during a radical curriculum overhaul, the dreadful, unholy mess known variously as “controlled assessment” or “coursework” or “practical assessment” (but whose names are actually legion) would certainly get my vote.

So, I was actually faintly encouraged by the reformed A-levels’ insistence that students have to DO twelve “required” practicals, and that all that schools have to provide is evidence that their students have DONE those practicals to allow a “practical endorsement” to be ticked on exam certificates. Assessment of students’ practical skills would be in the final examinations.

In my naivety, I thought that a set of properly written laboratory notebooks would be sufficient evidence for the practical endorsement to be awarded. I actually enjoyed explaining the protocols for keeping a lab book to our AS Physics group; for example, the idea that it should be a contemporaneous working document, replete with mistakes and crossings out — proper science in the raw, so to speak, warts and all. And not a suspiciously pristine, antiseptic and bowdlerised “neat” copy. And I thought our students responded gamely to the challenge, even down to worrying whether the pen with erasable ink counted as an “indelible pen” or not.

But, goddammit, the latest email from our exam board shows that the JCQ wouldn’t let it lie, they wouldn’t let it lie.

Now, we have to minutely “track” (dread word!) our students’ practical skills, verily even unto recording onto the Holy Spreadsheet if we have indubitable observational evidence of each student reading an Instruction sheet or not.

Oh deary deary me. It calls to mind Wilfred Owen’s memorable lines about being fit to “bear Field-Marshal God’s inspection”.

But it won’t be Field-Marshal God inspecting us. Instead, it will be a vast floppy-eared army of snaggletoothed practical-assessor-Brer Rabbits, hopping all over the land, measuring and marking, marking and measuring…

I give up: this seems to me like defeat, a return to the discredited and unlamented paradigm of controlled assessment. This is defeat, a totally avoidable defeat that has been snatched from the ravening jaws of victory…

The Twelve Physics Pracs of Gove Part 1
The Twelve Physics Pracs of Gove Part 2
Bring Back POAE!


Filed under Assessment, Science

Educational Defeat Devices


The Volkswagen Emissions Test Defeat Device needs no introduction:

Full details of how [the defeat device] worked are sketchy, although the EPA has said that the engines had computer software that could sense test scenarios by monitoring speed, engine operation, air pressure and even the position of the steering wheel.
When the cars were operating under controlled laboratory conditions – which typically involve putting them on a stationary test rig – the device appears to have put the vehicle into a sort of safety mode in which the engine ran below normal power and performance. Once on the road, the engines switched out of this test mode.
The result? The engines emitted nitrogen oxide pollutants up to 40 times above what is allowed in the US.
BBC News 4/11/15

This perceptive post from cavmaths shows, I think, the danger of relying on widely used educational “best practice” short cuts. They can actually be deleterious to student understanding. In short, many of them are simply “educational defeat devices”, clever tricks designed to give a false impression of student performance under artificial test conditions, cheats that fall apart when tested in the real world.


Filed under Assessment, PIXL, Society

PIXL: panacea or poison?

“How the understanding is best conducted to the knowledge of science, by what steps it is to be led forwards in its pursuit, how it is to be cured of its defects, and habituated to new studies, has been the inquiry of many acute and learned men, whose observations I shall not either adopt or censure”.
–Samuel Johnson, The Rambler, April 1750

A colleague described a recent visit to a highly successful science department that has drunk mighty deep of the PIXL well. I shall summarise some of her observations and comments below. My reactions varied from intrigued to puzzled to horrified, but in keeping with the Johnson quote above, I shall endeavour to urge neither adoption nor censure — at least until I have thought about it some more.

Item the first: textbooks are forbidden. Students are taught from in-house PowerPoints and worksheets which are made available online for individual study by students. My colleague reported that she visited several classes in the same year group, and all the teachers were teaching the same topic with the same PowerPoint — and were often on exactly the same slide at the same time! Reportedly, this system was set up because science leaders were not satisfied with the quality of lessons being planned by individual teachers. For myself, I couldn’t help but be reminded of the Gaullist education minister who claimed to know which page of which textbook children throughout France would be studying on that very day . . .

Item the second: science leaders have exhaustively analysed the GCSE exam board specification to produce the materials mentioned above. Every learning point is translated into “student friendly” language and covered in detail. My information is that a typical starter activity might be for students to copy down a summary of important information from a PowerPoint, before practising application using worksheets and past paper questions. These are often peer marked. Since planning and resource making have been centralised, the workload of the classroom teacher appeared to be more manageable than in many schools.

Item the third: students are regularly tested. Test papers are gone over with a fine tooth comb by the science team and areas of weakness identified. These are addressed in large, multiclass study skills sessions led by the head of science in the assembly hall, teaching from the front (brave woman!) using an old fashioned OHP and transparencies! (Sigh! Now that takes me back: I can almost smell the banda machine solvent as we speak.) Students are sat at exam desks for the session, and the hall is supervised by teaching staff and SLT (including the headteacher on the day my colleague visited). This is followed by a “walk and talk” mock (i.e. the answer is modelled by the Head of Science on her trusty OHP), followed by individual exam practice under exam conditions.

And so we come to the question: shall we adopt or censure these observations?

The truth is: I am not sure.

On the one hand, I can see how this might be a rapid and effective way to improve results, especially in a school with an inexperienced science team. And the part of me that actually likes writing schemes of work and resources would relish the challenge of developing such a scheme. And I’m told that percentage science pass rates improved significantly from the low teens to the high eighties . . . over the course of a single year! And you can’t really argue with such success, can you? (Actually, yes you can — see this post on the Halo Effect.) Also, as Lt. Worf of the starship Enterprise once observed: “If winning isn’t important, why keep score?”

And yet . . .

Part of me rebels at such regimentation. Is this an example of the “McDonaldisation” of education, the continuing process of deskilling the classroom practitioner? I genuinely hate to say this, but given this model maybe Sir Ken Robinson has a point; although this particular iteration seems to owe more to Taylorism than to the nineteenth-century workhouse.

Use another teacher’s PowerPoint? Ugh! I’d rather be forced to use his toothbrush . . .

And, while I grant that many examination questions are indeed fit for purpose and thoughtfully designed to expose misunderstandings and misconceptions, I cannot help the feeling that our examination system has become an overly-powerful tail wagging an emaciated dog.

Is learning truly synonymous with exam success? Have we become so enamoured of the assessment of learning rather than learning itself that we, like the Scarecrow in The Wizard of Oz, would not consider ourselves truly learnèd unless we hold a diploma saying that we are?


Why, anybody can have a brain. That's a very mediocre commodity . . . great thinkers . . . think deep thoughts and with no more brains than you have. But they have one thing you haven't got: a diploma.

I shall leave the final word to my friend Sam Johnson:

“The great differences that disturb the peace of mankind are not about ends, but means. We have all the same general desires, but how those desires shall be accomplished will for ever be disputed.”
The Idler, December 1758


Filed under Assessment, Education, PIXL, Science, Society