What does a number mean? If someone gets 80 on the TOEFL iBT, what does it actually mean? Is someone who gets 85 better than someone who got 80? Really? How?

What about in the classroom? You give students a score out of 10, 15, 20, etc., but what does the score really mean? What does good, very good or excellent mean? Do we really know what we mean? Do the students understand what the scores mean, or even what these expressions mean?

What if scores were totally replaced by descriptions or descriptors? What if we gave a student a description of what they had achieved?

Are we giving students marks because that’s the way it’s always been? Or maybe this is also a reflection of our educational backgrounds, as well as of the undergraduate and graduate courses we’ve taken. After all, we probably also got scores or grades.

And what about the usefulness of assessment? What can we do to make students more aware of their learning and, at the same time, get them to do something about it?

It might also require us as teachers to take a closer look at what we mean, and/or at what our students are really doing. But maybe we just don’t have the time?

So there are a lot of questions here, and a lot of elements. What now?

Well, what do you think? What do you do? How can we reinvent testing and assessment?

I’m hoping to address the questions above in future posts.

This is an interesting discussion – thanks for posting it.

Personally, I do think scores and numbers are useful, though only insofar as we really know what they refer to, and insofar as what they refer to can actually be trusted. That’s a pretty obvious statement, I guess, but it does lead to some further areas to think about numbers-wise:

a) All tests, as we know, are subject to Standard Error of Measurement (SEM), so a candidate could well get 80 on one day and 85 on another day – after all, the points are not really very far apart on the scale! But large testing organisations should know what the SEM is from their piloting and field studies (and ideally, should publish it in their reports). This means that, where two different scores are issued (say 80 and 85), they must be confident that the scores are a pretty reliable representation of the real difference between the two candidates. A lot of success here lies in good piloting and field work.

b) To reduce the effect of SEM, it is possible to have a smaller number of ‘pegged’ marks that are far enough apart to be clear about where a candidate’s performance lies. Percentage scales are notoriously unreliable, because after all, what is really the difference between (say) 76 and 77? The marks are not far enough apart to represent a clear difference in level.

c) Remember, too, that in order to work out whether person 1) has performed better than person 2) on different versions of a given test, you also need to know the Standard Deviation for that particular iteration of the test. What I mean is, if one test version is more difficult than another, then 80 on the more difficult test might be much better than 85 on the easier one! Of course, large testing organisations will do a lot of work in this area to ensure parity between different test items that are intended to test the same construct.
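
As a rough illustration of points a) and c), here is a minimal Python sketch. It uses the classical test theory formula SEM = SD × √(1 − reliability) and a simple standardised score (z-score). Every number in it (the standard deviation, the reliability, the means of the two test forms) is invented for illustration and is not a published figure for any real test. The function names are also just placeholders. Comparing z-scores across forms assumes the two groups of candidates are comparable; real testing organisations use more careful equating procedures.

```python
import math

def sem(sd, reliability):
    """Standard Error of Measurement in classical test theory."""
    return sd * math.sqrt(1.0 - reliability)

def score_band(score, sd, reliability, z=1.96):
    """Approximate 95% band around an observed score."""
    margin = z * sem(sd, reliability)
    return (score - margin, score + margin)

def z_score(score, mean, sd):
    """How many standard deviations a score lies above the mean of its test form."""
    return (score - mean) / sd

# Point a): hypothetical scale with SD = 10 and reliability = 0.91.
band_80 = score_band(80, sd=10.0, reliability=0.91)
band_85 = score_band(85, sd=10.0, reliability=0.91)
bands_overlap = band_80[1] >= band_85[0]

# Point c): hypothetical harder form (mean 65) versus easier form (mean 75).
z_hard = z_score(80, mean=65.0, sd=10.0)
z_easy = z_score(85, mean=75.0, sd=10.0)

print("Band around 80:", band_80)
print("Band around 85:", band_85)
print("Bands overlap:", bands_overlap)
print("z on harder form:", z_hard, "z on easier form:", z_easy)
```

With these made-up values the two bands overlap, so 80 and 85 cannot be cleanly separated, and the 80 on the harder form sits further above its mean than the 85 on the easier form.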

Hope this is helpful as a general starter.

Gerard

Dear Gerard,

I feel you miss the point that is being made: numbers and scores only say how much students have learned or achieved relative to the maximum score they might obtain, rather than what they have learned and how they might progress. And it should not be about whether ‘we’ know what these scores refer to; it should be about how students can benefit from the feedback they get. It should be about assessment FOR learning rather than assessment OF learning: the assessment itself should be a learning process for students. And numbers and scores simply do not offer this to students. Assessment should be about development: what have students achieved and how can they achieve the next developmental stage?

Hi Diana

Thanks for your comments. I think I agree with everything you have said. My only comment would be that I feel that summative assessment can be seen as ‘assessment for learning’ too. A lot of the work I have been doing in recent years has been to show how summative and formative assessment are inherently linked, with formative assessment being a very helpful tool in preparing students to approach summative assessments with a greater sense of confidence. Perhaps it is not so much a case of ‘assessment should be about’ as of using all our resources to help our students to learn.

Best, Gerard

Gerard, thanks for posting this question. It is one I hear often. How much assessment should be graded? Which types? What matters most to students in knowing where they are in relation to the goals? What if we used feedback – descriptive feedback that moved learning forward – and what if we gave students the opportunity to use that feedback right then and there? Marzano, Pickering and others say that clear learning targets, descriptive feedback and the opportunity to use it make the biggest impact. What if the focus of educational assessment was to help kids close the gap or move beyond? How would we score for that purpose? I made some other comments here…

http://www.nwea.org/blog/2013/educational-assessment-without-numbers/

Figures (0-10, 0-20, 0-100), letters (A, B, C, … as in the ECTS system), words (“pass” or “fail”, “excellent, good, sufficient, …”), … All are symbols summarizing a range of findings in one expression.

Just make a choice, depending on what you need and want to express. BUT make sure your student understands what it refers to. And if necessary (as it is in most cases), add sufficient and clear comments.

http://library.iated.org/view/SIMOENS2011SAS

http://dugi-doc.udg.edu/bitstream/handle/10256/821/original_simoens.pdf?sequence=2

http://www.cplol.eu/congress2009_proceed/proceed/06-02_SIMOENS-Luc.pdf