Categories
Coordination

Summary of Assessment Principles and Practices

Originally posted on February 2, 2019 @ 9:00 am

Last term the IB published a new document – “Assessment principles and practices – Quality assessments in a digital age”. Below I post my summary and notes on the document.

Introduction

The aim of this document is to explain the principles that the IB has adopted to make sure that assessment is meaningful, fair and in the best interests of the students involved.

All assessments are a balance between conflicting demands and many concerns about testing fail to take this into account. An example is the tension between reducing the assessment burden and the risk of candidates only having one opportunity to show what they can do.

The IB aims to be a holistic programme of study and this should be reflected in the assessments. Decisions should therefore focus on the impact of the overall programme, not just on one subject, discipline or assessment. The IB focusses on what it is important to assess and not what is easy to assess.

Assessment is all about balancing conflicting and competing demands. Poor quality assessments will lead to poor quality outcomes, even if assessment is “only” formative.

Assessment principles are what the IB thinks are important in creating qualifications and assessments. They come from what is considered important about an IB education. Assessments should support education, not distort it.

Assessment practices are the ways in which the principles are delivered in a meaningful and practical way.

IB exams should represent an opportunity for candidates to show what they understand, rather than being a unique experience which they need to master. Technology therefore should be driven by the assessment needs and not the other way around.

The IB is exploring eAssessment as a way to ensure that students are able to genuinely demonstrate what they know. In this way, assessments will seek to utilise technology where it may make the assessment more authentic. The IB has already moved to eMarking for many DP components and will seek to implement eAssessment in the DP, although it is aware of the risks:

  1. Burdens on schools
  2. Risk of failure
  3. Security
  4. Tech for the sake of Tech
  5. Bias against certain groups of students and device effects
  6. Changing standards
  7. Barriers to schools offering IB programs

Assessment

Assessment can mean many different things but is generally divided into summative and formative, although today there is a drive towards “assessment as learning”. Assessments need to be designed carefully to meet the purposes their results are used for. Excellent formative assessments may be poor summative assessments (See here).

Assessment can mean any of the different ways in which student achievement can be gathered and evaluated and there are different assessment models, namely the compensation model and mastery model. The mastery model requires a basic minimum in all criteria to be met, whilst the compensation model will allow poor performance in one criterion to be balanced by very good performance in another criterion.

Formative assessments aim to provide detailed feedback to teachers and students about the students’ strengths and weaknesses. In contrast, summative assessments focus on measuring what the student can do at a specific time. Summative assessment seeks to make a judgement about a candidate, not inform future teaching and learning.
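The difference between the two models can be made concrete with a minimal sketch. The criterion names, marks and thresholds below are invented for illustration and are not taken from the IB document:

```python
def compensation_pass(scores, total_needed):
    """Compensation model: only the total matters, so a strong
    criterion can make up for a weak one."""
    return sum(scores.values()) >= total_needed


def mastery_pass(scores, minimum_per_criterion):
    """Mastery model: every criterion must reach the minimum."""
    return all(mark >= minimum_per_criterion for mark in scores.values())


# Hypothetical candidate: weak on criterion B, strong on criterion C.
candidate = {"A": 5, "B": 1, "C": 8}

compensation_result = compensation_pass(candidate, total_needed=12)
mastery_result = mastery_pass(candidate, minimum_per_criterion=3)
```

With these made-up numbers the same candidate succeeds under the compensation model (the total is high enough) but not under the mastery model (criterion B falls below the minimum).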

The balance between measuring achievement and identifying correctly what still needs to develop is called assessment validity. It is important to note that the balance between quality of feedback and attainment is opposite in formative vs summative assessment.

Different national systems have adopted different approaches to assessment and these reflect the tensions between the wider aims of the society, the time and resources available. There is no one size fits all or perfectly optimal assessment system. Additionally summative assessment is increasingly being used to analyse teaching quality.

The backwash effect is the influence that an assessment has on the teaching of the content. This can be positive or negative. Snyder’s hidden curriculum is the meaning that students create about a discipline based on its assessment tasks. Assessment needs to be designed around constructivist learning theory.

Marks and grades are not the same thing. Marks refer to credit given to a candidate in line with a mark scheme, and have no other meaning. Grades describe the quality of a candidate’s work.

Generally IB assessments are not norm referenced. The IB generally uses marks as an indication of overall performance and then looks at how well candidates with x marks performed matched to the grade level descriptors. They then place boundaries based on the descriptors.
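Once boundaries have been placed, turning a mark into a grade is a simple lookup. The sketch below assumes a 1 to 7 grade scale and uses invented boundary values; the real boundaries come from the judgement against the descriptors described above, not from any formula:

```python
from bisect import bisect_right

# Hypothetical boundaries: the lowest mark that earns grades 2, 3, 4, 5, 6 and 7.
BOUNDARIES = [15, 28, 40, 52, 64, 76]


def grade_for(mark):
    """Return the grade (1 to 7) for a mark, given the boundaries above."""
    # bisect_right counts how many boundaries the mark has reached.
    return 1 + bisect_right(BOUNDARIES, mark)
```

For example, with these invented boundaries a mark of 63 sits just below the grade 6 boundary and so is reported as a 5. The mark itself carries no meaning until the boundaries give it one.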

Validity means asking if an assessment is fit for purpose. The IB’s first concern is whether the programme is valid, then whether the elements of the programme are valid and finally whether assessments are valid. Validity is not an objective concept and is a balance between competing issues. We cannot prove validity but can construct a validity argument based on evidence.

eAssessment offers new opportunities for interaction within exams and therefore improves the validity of some aspects of assessment. It also removes some of the security concerns while introducing others.

Validity chains can be used to think about validity. There are five elements to the chain:

  • Reliability
  • Construct relevance
  • Manageability
  • Fairness
  • Comparability

All of the above are necessary to achieve validity but there are also tensions between each of them. The IB places the most value in construct relevance – assessments that actually test what they intend to test, but not at the exclusion of all else.

Reliability is the extent to which a candidate would get the same test result if the testing procedure was repeated. There are several sources of unreliability. Consistent outcomes are not the same as the right outcome. The aim of marking reliability is to ensure that all examiners make the same judgement as the senior examiner.

Construct relevance is concerned with accurately measuring the thing that the assessment is attempting to measure.

Manageability can be discussed in terms of the candidate, the school and the IB.

Fairness and bias is concerned with ensuring that the test does not give an advantage to one group over another. Bias can arise from:

  • The delivery of the assessment
  • Bias arising from marking
  • Bias related to assessment questions

Comparability of assessment is concerned with how the grades from assessments can be compared between years or subjects. The IB seeks to maintain three principles about comparability:

  1. The standard of work to achieve grades within a subject or discipline is comparable between years.
  2. Grades between subjects have a consistent meaning so that different routes to achieve the program award are comparable.
  3. Although the IB aims to focus on higher order skills, IB assessments are broadly comparable with similar exams offered by individual nations or other awarding bodies.

IB’s approach to validity

The IB believes that construct relevance and authentic assessment are more important than maximising reliability. The IB believes in rounded, holistic education. Its priority is for strong arguments of validity at programme level. Validity is a complex and multi-faceted balancing act and there is no single right answer; where you place the balance is ultimately a judgement based on the values of the organisation that is developing the assessments. The IB aims to do more than other curricula by developing inquiring, knowledgeable and caring young people who are motivated to succeed. We need to consider how the aims of individual subjects fit into the holistic aims of the IB.

IB assessments are weakly criterion referenced. That is, candidate performance is matched against behavioural descriptors.

Comparative marking represents an alternative to traditional marking. The basis is that the human mind is better at making comparisons than absolute judgements. In this approach examiners make win/lose comparisons between pieces of work. In subjective marking, mathematics, combined with an importance statement, will allow a team of examiners to compare and “mark” students’ work. However, comparative judgement requires many marking decisions because each piece of work must be looked at several times by several examiners.

For the IB the underlying principle is to test what is important and assessments should encourage good teaching. Comments on summative work are used to support examiner marking. Comments on formative work are there to give feedback to learners.
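To see how win/lose decisions can become a rank order, here is a minimal sketch that ranks scripts by the proportion of comparisons they won. The script names and judgements are invented, and the win-rate calculation is a deliberate simplification: it is an assumption for illustration, not the method the IB document describes, and real comparative judgement systems typically fit a statistical model to the decisions instead.

```python
from collections import defaultdict


def rank_by_win_rate(judgements):
    """Rank scripts from a list of (winner, loser) decisions.

    Each script is ordered by wins divided by comparisons it took part in.
    """
    wins = defaultdict(int)
    seen = defaultdict(int)
    for winner, loser in judgements:
        wins[winner] += 1
        seen[winner] += 1
        seen[loser] += 1
    return sorted(seen, key=lambda script: wins[script] / seen[script], reverse=True)


# Hypothetical decisions made by several examiners.
judgements = [
    ("script_A", "script_B"),
    ("script_A", "script_C"),
    ("script_B", "script_C"),
    ("script_C", "script_B"),
    ("script_A", "script_B"),
]

ranking = rank_by_win_rate(judgements)
```

Even this toy example shows why the approach is labour intensive: five decisions are needed to order just three scripts, and two of them disagree with each other.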

IB programme-specific processes

Key elements that link all IB programmes are:

  • The learner profile
  • Approaches to teaching and learning
  • International mindedness

IB programmes are conceptual, that is, they focus on powerful organizing ideas that are relevant across subject areas and that help to integrate learning and add coherence to the curriculum. We need to consider how our assessments within each program meet the broader objectives of the DP.

ATLs

The IB recognises the need for schools and individual teachers to have the space to be creative and the ATLs are suggested as guidance and to help highlight good practice and enable discussion. The ATLs are not meant to be prescriptive. Skills can only be improved over time and if taught in a sustained fashion.

Notes

There is definitely some new information here that I am happy to receive. I was not aware that teachers could be observers at grade awarding meetings or at the final awarding committee, and this is something that I would definitely pursue in the future or recommend my teachers do. I was also pleased to see the section on ATLs and the nod from the IB that these are not meant to be prescriptive and that schools and teachers should be free to be creative.

Categories
Education

Post 100: Learning in the age of Humanism

Originally posted on December 1, 2018 @ 10:30 am

In the UK, prior to the Renaissance, learning in schools mainly consisted of the trivium of grammar, rhetoric, and logic and the quadrivium of music, astronomy, arithmetic and geometry. Pupils were taught these to prepare them to progress on to the study of theology, law and medicine and education primarily served the church and state.

Following this, the ideas of humanism, the religion that replaces god with the experience of mankind, began to take root in the philosophy of education, leading to the idea that education is a right for all and that it should be administered by the state to ensure that that right is gained by all. In the 19th century the three great humanist projects of Communism, Fascism and Liberalism were born and in some way each has contributed to the modern humanist education agenda. In this century we are at the zenith of the age of humanism, with even modern religions adopting a humanistic outlook, according to the work of Yuval Noah Harari.

Modern schools and the modern western education system were built on the principles of humanism and many of the ideas in education make humanist assumptions about the individual. For example the roots of progressive education are based on post-romantic ideas about the beauty of individual experience, an idea that is also the underpinning of capitalism and democracy; two ideas that are practical manifestations of the philosophy of liberalism. On the other hand ideas within the traditional approach to education appear to be rooted in the idea that all humans are created equal; an idea that has its roots in Communism, another humanist tradition.

Schools today are very much humanist institutions. But many educators today are confused on this front. Hence some fight against factory schooling, using the factory as a metaphor for an education system that is dehumanised. It was humanist ideas that led to an education system that prepared individuals for an industrial society. It was humanist ideas that demanded that everyone had the right to an education. It was humanism that led to the organisation of state run schools. Factory-like in that they were exposing lots of different people to ideas that generations before had not had the luck or rights to access. Factory-like, in that this is an efficient model of mass production. Mass production of more (compared to their parents) knowledgeable citizens. The irony here shouldn’t be lost on you. It’s ironic that the factory, a product of the ideas tied to liberalism that made the industrial revolution possible (Governments and Businesses needed to care about the individual in a way that they just didn’t need to before), now serves as a metaphor for why schools are perceived by some as being dehumanising.

Really, the confusion here is due to a misunderstanding of history and ideas. Educators who use the factory as a metaphor don’t understand their history, and their world view is informed more by liberal humanism than by communist or fascist humanism.

Educators also get their modern politics confused. It surprises some that teachers who self-identify as being “trad” are Labour voters. See this blog. They mistakenly think that anyone who advocates for traditional teaching must be a conservative. In his book Why Knowledge Matters, E.D. Hirsch draws a distinction between community and knowledge centered education and individual and skills centered education, arguing that the former acts to reduce inequality, while the latter cherishes the individual experience of the learner. Both are fundamentally humanist in their paradigm.

If the aim of traditional education is to reduce inequality by ensuring that ALL students receive the same knowledge centered education then it is easy to see why this is a left-leaning idea. Traditionally the left leans towards communist humanism, valuing equality over freedom, while the right leans towards liberal humanism, valuing individual freedom over equality. That is, of course, an oversimplification but it roughly works.

If you spend any time on edutwitter or reading about educational history you won’t have missed this tension at the heart of the philosophy of education. This debate is often personal and vitriolic. Writers come down hard on each other, often forgetting that their opponents want the same thing as them: the best outcomes for the individual. In this sense both of these opponents are humanist and believe in the humanist agenda. But a bit like the schisms in the Christian church that saw Catholics and Protestants burn each other at the stake over their differing interpretations of the same creed, modern day “philosophers” of education are happy to figuratively hang, draw and quarter their opponents for having slightly different views. Understanding this should be the basis of finding points of convergence in a debate.

The modern drive for personalised education is a manifestation of the principles of humanist liberalism and, in its present form, appears to conflict with the ideas of a community-centered education system whose aim is to reduce inequality. To my mind it is the educational equivalent of “organic” farming: it markets itself very well but is hard to scale up and is incredibly inefficient in its resource consumption.

In my next post I want to explore how personalised education may fare as we move deeper into the 21st century. Everyone assumes that this is the way we are going as an education system, but is it? In his books Harari writes about Dataism, and how this new philosophy may end up replacing humanism. I am interested in thinking about what this may mean for the modern education system.

Categories
Education

Learning in the age of Dataism

Originally posted on December 8, 2018 @ 8:15 am

Last week, I outlined in this post how modern schools and the modern western education system are built on the principles of humanism, showing how many of the modern assumptions in education are based on humanist claims. For example the roots of what is termed progressive education are based on post-romantic ideas about the beauty of individual experience, an idea that is also the underpinning of capitalism and democracy; two ideas that are practical manifestations of liberalism, a creed of humanism. On the other hand ideas within the modern traditional approach to education are rooted in the idea that all humans are created equal and as such have rights to an equal education; an idea that has its roots in Communism, another humanist tradition. In this post I want to explore how the ideas of Yuval Noah Harari, set out in Homo Deus, may relate to the modern education system.

Harari contends that society at large may soon be leaving the age of humanism and entering an age of dataism. Dataism is defined as an emerging ideology in which information flow is seen as the supreme value. It is used to describe the mindset created by the emerging significance of big data.

Harari goes on to argue that Dataism, like any other religion, has practical commandments. A Dataist should want to “maximise dataflow by connecting to more and more media”, and believes that freedom of information is “the greatest good of all”. Harari also argues that Aaron Swartz, who took his life in 2013 after being prosecuted for releasing hundreds of thousands of scientific papers from the JSTOR archive online for free, could be called the “first martyr” of Dataism.

Big data is only just beginning to make inroads in education and yet already we are seeing its potential. One of the areas in which I have begun to witness this is university guidance. Already, platforms are springing up that aim to utilize big data to help students make sense of the options available to them. Companies like BridgeU use algorithms to help locate universities and courses based on student preferences. Information flow here is already quickly becoming the supreme value and will allow individuals to make slicker, more efficient choices, perhaps with a lot of time saved to boot. On one hand, this appears to be at odds with the individual focus of human university counseling. Indeed some colleagues have told me that they prefer the platforms, like Unifrog, that have less of a data driven, algorithmic, inhuman agenda. But I actually think that these systems have the potential to improve the lot of many individuals by freeing up time and making research on an overwhelming range of options more focussed. Sometimes less is more.

Elsewhere, big data is behind standardised testing programmes like those administered by CEM and the success of many new pedagogical applications that aim to apply scientific findings about learning to course material, like the textbooks developed by Kognity.

On the face of it, information flow as a supreme value could seem to be at odds with the current paradigm of education: humanism. Isn’t it the love of more data that can drive poor interventions in schools, like when schools require more frequent testing and measurement of progress, so that students and teachers alike are pressured into making decisions to maximise progress? Often these interventions come at the expense of learning, causing poor behaviours like teaching to the test and cramming. Or how about when teachers spend hours agonising over data collection and data entry instead of planning the next teaching sequence?

Schools love data. Whether it is data on student progress or on student performance, teachers and senior managers love it. However, in some ways many schools have gone through the dataism paradigm already, and come out the other side. The debate, in the UK at least, is shifting away from purely data driven measures of progress to measuring quality of education, by including curriculum measures, something far more qualitative than quantitative. While schools love data they are also fundamentally humanistic in nature. Whatever the motivations of educators have been historically, education is an affair of the human experience. It is a someone who is being educated. It is a someone who is having their mind and thinking patterns altered in some way.

In a sense, though, knowing is data flow. While information and knowledge are not the same thing, knowing relies on using information. It is how the mind works with information that results in knowing. And so, if society were to progress into an age where data would reign as the supreme good, education would still be able to maintain its focus on the individual experience.

Harari analyses human history through the idea of dataflow, the new emerging modern value. He writes that “the crippling thing about religion is that it reduces data flow”. By this he means that modern religions are not innovators in religious experience. They are conservative and restricted by holding onto old and outdated traditions and values. I know this first hand. Growing up in my evangelical family I was not allowed to read books by David Hume because he was a “humanist”. This conservatism restricts knowledge transfer to and development within the individual, thus from a dataist point of view: religion restricts dataflow.

Schools, then, could well be institutions that, through learning of knowledge, promote dataflow. Education could develop a dual purpose, the traditional humanist purpose of educating the individual for their benefit or society’s, and the modern purpose of ensuring that dataflow is maximised and made efficient through the education of individuals as producers and consumers of data.

If this is right, perhaps this is more cause to be concerned with the fad of 21st century skills. In this previous post I partially outlined why there is no such thing as generic skill acquisition. Real 21st century skills will require lateral thinking, stress management and an ability to see the big picture as well as know the details and have expertise. All of this requires knowing. The more you know, the easier it is to learn more. Yes, humans may never be able to outcompete computers for what they can know but that doesn’t mean we should give up learning. Actually we need to give more careful thought to what our students do know.

One of the areas where dataism and big data may well come to play a larger role in the education system and also benefit the liberal philosophy of many schools is through personalised learning. The ability of big data, algorithms and artificial intelligence to tailor learning experiences to individual needs is already being seen in a plethora of learning applications and websites that are adaptive in formative assessment and tailoring of content. I imagine that we will see a lot more of these tools becoming available in mainstream schools as we progress through this century.

Categories
Teaching & Learning

Developing a progression model for IBDP biology

Originally posted on November 24, 2018 @ 4:00 pm

I recently completed Daisy Christodoulou’s “Making Good Progress?”. You can see my notes here. In the final chapters, after presenting an argument building up to this, she outlines the key aspects of what she terms a “progression model”. In this post I want to line up some ideas about what this may look like in delivering the IB DP Biology course.

In her book Christodoulou suggests, and I agree, that to effectively help students make progress we have to break down the skills required to be successful in the final assessments into sub-skills and practice these. This is a bit analogous to a football team practicing dribbling, striking or defending in order to make progress in the main game.

In the book she also stresses the difference between formative and summative assessments, what they can and can’t be used for respectively and why one assessment can’t necessarily be used for both.

A progression model for biology

A progression model would clearly map out how to get from the start to the finish of any given course, and make progress in mastering the skills and concepts associated with that domain. In order to do this we need to think carefully about:

  1. What are the key skills being assessed in the final summative tasks (don’t forget that language or maths skills might be a large component of this)?
  2. What sub-components make up these skills?
  3. What tasks can be designed to appropriately formatively assess the development of these sub-skills or, in other words, what does deliberate practice look like in biology?
  4. What would be our formative item bank?
  5. What could be our standardised assessment bank?
  6. What are appropriate summative assessment tasks throughout that would allow us to measure progress throughout the course?
  7. What could be our summative item bank?
  8. How often should progress to the final summative task be measured i.e. how often should we set summative assessments in an academic year that track progress?

Key skills in biology

This is quite a tricky concept to pin down in biology specifically and in the sciences in general. What skills exactly are kids being assessed on in those final summative IGCSE or IBDP/A Level exams? I haven’t done a thorough literature review here so currently I am not sure what previous work has been done in this area.

However, I would contend that most final written summative exams are assessing students’ conceptual understanding of the domain. If this is the case then the skill is really thinking and understanding about and with the material of the domain. Students who have a deeper understanding of the links between concepts are likely to do better.

In addition, those courses with a practical component, like the IBDP group 4 internal assessment, are assessing a student’s understanding of the scientific process. While it may seem like these components are assessing practical skills per se, they only do this indirectly, as it is the actual written report that is assessed and moderated. To do well the student is actually demonstrating an understanding of the process, regardless of where their practical skills are in terms of development.

Indeed if we look at the assessment objectives of IBDP biology we see that this is very much the case. Students are assessed on their ability to: demonstrate knowledge and understanding and apply that understanding of facts, concepts and terminology; methodologies and communication in science etc.

Sub-skills

How can we move students to a place where they can competently demonstrate knowledge and understanding, apply that understanding as well as formulate, analyse and evaluate aspects of the scientific method and communication?

The literature on the psychology of learning would suggest breaking down these skills into their subcomponents. This means we need to look at methods that develop knowledge and understanding from knowledge. Organising our units in ways that help students see the bigger concepts and connections between concepts within the domain will also help. For more on this see my previous post here. I think that understanding develops from knowledge.

I recently read that Thomas Kuhn claimed that expertise in science was achieved by the studying of exemplars. Scientific experts are experts because they have learned to draw the general concepts from the specific examples.

Useful sub-skills would be:

  • Fluency with the terminology of the domain
  • Ability to read graphs and data
  • Explicit knowledge of very specific examples
  • Explicit knowledge of abstract concepts illustrated by the specific examples
  • Ability to generate hypotheses and construct controlled experiments

Deliberate practice in biology

Thinking about these sub-skills, then, we can see what may constitute deliberate practice in biology and thus what would make useful formative assessments within the subject.

Fluency with the terminology can be gained through the studying of terminology decks like those available on Quizlet. However, the work of Isabel Beck suggests that learning words isolated from text is not that helpful to gaining an understanding of those terms. To gain this, students need to be exposed to these words in context. Therefore there is a lot to be said for tasks and formative assessments that get students reading. Formative assessments could then consist of vocab tests and reading comprehension exercises of selected texts.

Reading and interpreting data can be improved through practice of these skills. This is an area where inquiry alone won’t help students make progress. Students need to be shown how to interpret data and read tables and graphs before making judgements. Ideally, in my opinion they should do this once they have learned the relevant factual knowledge of a related topic. Formative assessments focussing on data interpretation should therefore come a little later once students have covered a bulk of the content.

To build up conceptual understanding, students need to be exposed to specific examples related to those topics as I outlined in this post. Tests (MCQs) that assess how well students know the specific details of an example could be useful here to guide learners to which parts they know and those they don’t.

Following this we can begin to link examples together to build knowledge of a more abstract concept. Concepts can then be knitted together to develop the domain specific thinking skills: thinking like a biologist.

Formative assessments

Formative assessments could take the form of MCQs but as outlined above, vocab tests, reading comprehension activities, and other tasks may well have their place here.

Summative assessments for measuring progress

I am now thinking that to truly assess student progress against the domain, individual unit tests just won’t cut it. As Christodoulou argues, summative tests exist to create shared meaning and to do that they need to be valid and reliable. Does scoring a 7 in a unit test on one topic of an 11 topic syllabus mean that the student is on track to score a 7? Not necessarily. Not only is the unit test not comparable to the IB 7 because it is only sampling a tiny portion of the full domain, but the construction and administration of the test may not be as rigorous as that of the actual IB papers.

Clearly it isn’t ideal to use the formative assessments described above as these are nothing like the final summative assessment of the course, plus their purpose is to guide teaching and learning, not to measure progress.

I would argue that summative assessments over the two-year course should use entire past papers. These past papers sample the entire domain of the course and performance against them is the best measure of progress in the domain. A past paper could be administered right at the start of the course to establish a baseline. Subsequent, infrequent, summative tests, also composed of past papers, could then measure progress against this baseline.

Why should summative assessments use past papers? Why not use unit tests? Unit tests, aggregated, are not the same thing as performance on a single assessment sampling the whole domain. They cannot produce the same shared meaning as an assessment that samples the entire domain. In addition the use of many single unit, high stakes tests will cause teaching to the test as well as much more student anxiety. Instead lots of formative testing and practice of recall should help to build students’ confidence in themselves.

Categories
Education

Undervalued & under-taught: concepts missing in teacher education

Originally posted on November 19, 2018 @ 8:00 am

Freire, Piaget and Vygotsky are the usual suspects in the theory that underpins many initial teacher training courses, at least in my experience; I am happy to be corrected. The theories of these men, while useful and, in parts, necessary, are often presented as the ground truths of teaching and learning or as outright fact.

Over the past few years I have picked up a little bit of knowledge about certain concepts that, if not completely debunking many of these operational fact-theories, certainly voice a challenge to them and I think it would do a lot to develop educators’ own critical thinking faculties if these concepts were taught alongside the main teacher training dogma.

Many of these ideas I have been exposed to through my own semi-self directed reading about education. I write semi-self directed because although I am choosing which books to read and when, I rely on the recommendations of colleagues and the educators that I follow on Twitter.

While each of these, on their own, may not be threshold concepts as such (if such a thing exists), learning about them has had a developmental effect on my thinking as an educator.

In my thirties, I can now begin to trace back my own intellectual interests and growth of knowledge. Originally, I was only interested in biology and things directly related to that field. Working as a teacher, my interest in this subject prompted me to develop my knowledge of neuroscience, among others. From here I developed an interest in the teenage brain and then neuromyths.

During my PGCE top-up, it was clear that subjective research processes were held to be just as valid as objective research methods. I challenged some of the ideas fed to us about subjective & experience-based research, arguing that evidence needs to be as objective as possible. My ideas were met with some scepticism, but I went ahead and tried to summarise some of the work on educational neuroscience and to do some sort of quantifiable research on teachers’ understanding of neuromyths.

Despite the lack of rigour and balanced curriculum, topping up my GTP to a PGCE was worth it. I wanted to do the PGCE because I felt my GTP had not had any academic focus and I didn’t like the fact that I didn’t know much about the theory behind what I was being told to do in the classroom. My PGCE served to get me academically engaged with the educational theory and it is only since I completed it that I have continued to maintain that engagement.

My interest in this area hasn’t abated but as I learn more it has become more nuanced. I agree that we need to be careful interpreting the results of much cognitive research but I do think that it offers the power to help guide us to what may be better versus worse pedagogical techniques. It may well help us hone our pedagogical content knowledge.

Currently, these are the ideas that I believe that all teachers should have some training on (in no particular order):

Where can you go to get more valuable knowledge on these concepts? Here are some of the resources that I would recommend:

The Education Endowment Foundation

Daniel Willingham’s blog

The Education Development Trust

Core Knowledge Curriculum

The Learning Scientists

The Learning Spy