Geoff Petty - Graded lesson observations are dead in the water

(This article first appeared in inTuition autumn 2015)

G Petty December

A few years ago Ofsted turned its back on graded lesson observations, but many colleges persist with them. Geoff Petty looks at the evidence and the alternatives.

Ofsted has abandoned graded lesson observations as neither valid nor reliable, yet many colleges still use them, despite the research evidence against the grades. Helpfully, Professor Robert Coe of Durham University has scrutinised the research on graded lesson observations. His blog on this topic is startling, and his critique probably influenced Ofsted’s decision. He concludes: “Highly trained observers using the best methodologies can only tell an above-average teacher from a below-average one about 60% of the time. If they tossed a coin it would be 50%. When untrained observers identify something as best practice, it often isn’t.”

Professor Coe imagines consecutive Ofsted observations of the same teacher and says the data shows that an ‘outstanding’ grade will be downgraded by the next observation 75% of the time, while an ‘inadequate’ lesson will be upgraded by the second observation 90% of the time.

This leaves one to wonder why Ofsted and colleges have imposed grading on largely reluctant teachers for so long. Did Ofsted consult any research before adopting graded observations? If there wasn’t any, did it commission some? Ofsted inspections have a tremendous impact on the reputations of providers and teachers, who go to extraordinary lengths before, during and after inspection visits to ensure that their teaching and learning is fairly represented and, when required, that they learn from Ofsted’s feedback.

Ofsted’s decision to abandon graded lesson observation is very welcome. It will save a lot of jitters and nail-biting, but one wonders why, with so much at stake, it took the inspectorate so long to make this change when there was plenty of more reliable data to inform its judgements.

The irony is that colleges are swamped with data that is much more reliable and valid: student achievement and retention rates, value-added data, grade profiles, student satisfaction surveys and, especially, reports that consider these together for the same teacher or course and show how they change over time. Even this data needs to be interpreted with caution, though.

The other irony is that discovering weak teachers and courses, then improving them, is not as effective as expecting all teachers and courses to improve. Why embed complacency into your quality system? The damage caused by grading is more than that caused by dodgy data. When teachers know they are to be graded they inevitably try to guess what the observer will be looking for and often try to teach in this way.

They stop asking themselves: ‘What is good teaching?’ and start to ask: ‘What are they looking for?’ In doing so, the search for excellence is replaced by a largely futile attempt to guess what is in observers’ heads and to remember checklists, which we now know to be highly variable and often wrong. Teachers become extrinsically motivated instead of intrinsically motivated. A century of research shows that extrinsic motivation reduces creativity and lowers standards in complex tasks (Daniel Pink, 2011).

And instead of grading? As I wrote in 2013: “If a teacher is underperforming, it is not because they have a battalion of outstanding teaching methods they are not prepared to use until they are sufficiently threatened. It’s because there is a knowledge and skill deficit. So the cure is learning, not grading.”

For decades I asked advanced practitioners (APs) from hundreds of institutions what worked best in their institution to improve teaching and learning. To my delight many of them alighted on an approach advocated by the two major research reviews on effective continuing professional development – Timperley (2007) and Joyce and Showers (2002) – despite not knowing about these studies.

Other approaches advocated by APs included ungraded lesson observations and ‘observing to learn’, where all teachers observe lessons, not to judge them, but to try to learn how their own teaching could be improved. However, observation didn’t come out trumps. Other approaches, such as videoing lessons and then discussing parts of the video with coaches, came out well. The ‘supported experiments’ approach was highly prized too. After all, this is what the research reviews advocate, and they are our most reliable source of evidence for what works best in teacher improvement.

Lesson observations – graded or otherwise – are not advocated by research reviews on how to improve teaching; indeed, there are warnings against them. It’s great that Ofsted has moved away from graded observations and embraced the use of evidence. I hope many providers do the same.



Professor Robert Coe, ‘Classroom Observation: It’s Harder Than You Think’

Helen Timperley et al (2007) ‘Teacher Professional Learning and Development’, Best-evidence synthesis iteration

Joyce and Showers (2002) ‘Student Achievement Through Staff Development’, 3rd ed

Daniel H. Pink (2011) ‘Drive: The Surprising Truth About What Motivates Us’