In this series of posts I record my notes from Daisy Christodoulou’s book “Making good progress? The future of Assessment for Learning”. It is quite excellent. You can buy a copy here.
Improving summative assessments
The aim of summative assessments is for them to provide an accurate and shared meaning without becoming the model for every classroom activity. Rubrics of prose assessment statements are not particularly good at delivering reliability, and they can end up compromising the creative and original aspects of the task. Prose descriptors can be interpreted in many different ways. Judging in absolute terms is extremely difficult. Markers will overgrade and undergrade depending on the sample. We are much better at making comparative judgements than absolute ones.
Very prescriptive rubrics end up stereotyping pupils’ responses to the task, which removes the justification for setting such tasks in the first place (grading creative and original work). Responses that are coached to meet the rubric pass, and truly original work that doesn’t meet it fails. Rubrics encourage coaching.
Comparative judgement offers the possibility of dropping rubrics by defining quality through exemplars not prose and by not relying on absolute judgement. It simply asks markers to make a series of paired judgements about responses. It relies on tacit knowledge of the subject expert – knowledge that is not easy to express in words.
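To make the idea of turning paired judgements into a scale concrete, here is a minimal sketch. The notes do not name a statistical model, so the Bradley-Terry model used below is my assumption, and the script labels and judgements are invented for illustration. Each judgement is a (winner, loser) pair, and the function estimates a relative quality score for every script.

```python
from collections import defaultdict


def bradley_terry(judgements, iterations=200):
    """Estimate a relative quality score per script from (winner, loser) pairs.

    Assumption: a Bradley-Terry model fitted with a simple iterative update.
    Each pair must contain two different scripts.
    """
    scripts = sorted({s for pair in judgements for s in pair})
    wins = defaultdict(int)          # total wins per script
    pair_counts = defaultdict(int)   # number of comparisons per unordered pair
    for winner, loser in judgements:
        wins[winner] += 1
        pair_counts[frozenset((winner, loser))] += 1

    strength = {s: 1.0 for s in scripts}
    for _ in range(iterations):
        updated = {}
        for s in scripts:
            denom = 0.0
            for pair, n in pair_counts.items():
                if s in pair:
                    (other,) = pair - {s}
                    denom += n / (strength[s] + strength[other])
            # Tiny floor so a script with no wins keeps a positive score.
            updated[s] = max(wins[s], 1e-6) / denom
        total = sum(updated.values())
        strength = {s: v / total for s, v in updated.items()}  # scores sum to 1
    return strength


# Hypothetical judgements on four scripts: (preferred script, other script).
judgements = [
    ("A", "B"), ("A", "B"), ("B", "A"),
    ("A", "C"),
    ("B", "C"), ("B", "C"), ("C", "B"),
    ("C", "D"), ("C", "D"), ("D", "C"),
    ("B", "D"),
]

scores = bradley_terry(judgements)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Note that no marker ever assigns an absolute mark here. The scale emerges from many quick "which is better?" decisions, which is the kind of judgement the notes say we are better at.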
Comparative judgement is criticised for offering little in the way of formative feedback. This is precisely the point. It decouples the grading process from the formative process. It allows classroom practice to be refocussed away from the rubric and towards helpful analyses of quality. One extremely useful resource that could be produced would be a set of annotated exemplar scripts.
Decisions about the difficulty and content of national summative exams are made by national exam boards. What if a school wants to assess summatively more frequently? To what extent can those assessments be linked to the curriculum that the pupils are following? One solution is to outsource summative assessments, but there is still a gap between remote standardised assessments like CEM and the formative assessments of classroom practice. It is not easy to create school-made, curriculum-linked assessments or to interpret their results. It can be difficult to tell if a test is difficult enough, or if it has the right spread of difficulty. Tests taken by small numbers of pupils don’t produce reliable grades. We cannot easily compare the results of teacher-made tests to national assessments. The content studied over one term is simply not a broad enough domain to sample from. Assessments have to sample from what pupils have learnt in that subject, not just in previous terms but in previous years.
A summative assessment can be linked to the curriculum and the most recent unit of study. However, if a grade is awarded, it will not be based solely on that unit and cannot be seen as reflecting performance on that unit alone. A pupil can make great strides within a unit without this being reflected in the summative grade, because the assessment is not sensitive enough.
Summative assessments need to be far enough apart that pupils have the chance to improve on them meaningfully. Pupils will make relatively slow progress on the large domains that summative assessments sample, so there are risks in using summative assessments too frequently.
Using scaled scores can overcome this to some extent. A scaled score converts raw marks from different assessments, which are not directly comparable, into scores that are. Scaled scores show the continuum of achievement. Grades suggest that pupil performance falls into discrete categories when in fact it is continuous.
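As a rough illustration of what converting raw marks onto a common scale can look like, here is a minimal sketch. It uses simple linear standardisation, which is my assumption and not a method described in the notes. Real scaled scores are usually produced with more careful statistical equating. The target mean of 100, the standard deviation of 15 and the two sets of marks are all invented, and the approach only makes sense if the groups sitting each test are of broadly similar ability.

```python
from statistics import mean, pstdev


def scale_scores(raw_marks, target_mean=100.0, target_sd=15.0):
    """Map raw marks onto a common scale by linear standardisation.

    Assumption: target_mean and target_sd are arbitrary illustrative choices.
    """
    mu = mean(raw_marks)
    sigma = pstdev(raw_marks)
    if sigma == 0:
        raise ValueError("all raw marks are identical, so they cannot be scaled")
    return [target_mean + target_sd * (m - mu) / sigma for m in raw_marks]


# Hypothetical raw marks for the same five pupils on two different tests.
autumn_test = [12, 18, 24, 30, 36]   # marked out of 40
spring_test = [35, 45, 55, 65, 75]   # marked out of 80

autumn_scaled = scale_scores(autumn_test)
spring_scaled = scale_scores(spring_test)

for raw_a, sc_a, raw_s, sc_s in zip(autumn_test, autumn_scaled,
                                    spring_test, spring_scaled):
    print(f"autumn {raw_a:>2} -> {sc_a:6.1f}   spring {raw_s:>2} -> {sc_s:6.1f}")
```

The raw marks of 24 out of 40 and 55 out of 80 look very different, but both sit at the middle of their own test and so land on the same scaled score. The output is also a continuous number, not a grade band, which matches the point about achievement being a continuum.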