ASCD Education Update published an article, Are Grades Reliable? Lessons from a Century of Research, by Susan M. Brookhart and Thomas R. Guskey. I have always valued both authors’ insights regarding assessment, grades, and feedback.
“But no amount of reform in grading policies or report card formats will improve grading if the grades reported are not reliable.”
— Brookhart and Guskey
After highlighting a century of studies that illustrate the unreliability of grades, they offer some key considerations for bringing about improvement:
- Clarify criteria. For me, this means being able to clearly illustrate or describe what mastery of the skill or standard looks like and sounds like. Having defined mastery, teachers could then agree on, or process with students, what above-standard criteria would entail. I have worked with some teachers who leave the highest level of a rubric blank, asking students to create it. Below standard should mean being able to identify the missing elements.
- Be consistent. I recall working with a fourth-grade PLC that used a common assessment for checking students’ understanding of main idea. After reading a three-paragraph selection, students were asked to complete a graphic, writing the main idea in an oval and placing four supporting details on rays extending from the oval. I asked teachers to sort student work into three piles: Mastery; Not Mastery; and “I’m not sure.” I was intending to focus the conversation on the “I’m not sure” group. However, when teachers passed their mastered stack to their colleagues, disagreement emerged, and we spent the rest of the time and the next meeting getting clearer on mastery. I recall the amazement when one teacher defended her students: “Don’t blame them. That’s how I taught it.” Clarity on what mastery means is a critical guide for instruction.
- Use simple scales with a few distinct categories. The authors clarify the difference between being able to agree upon criteria on a Pass/Fail assessment versus a 1-4 scale or a 100-point system. How clearly can the teacher identify the performance difference between a 92 and a 91?
The authors reinforced for me the challenge of grade reliability and the role that Professional Learning Communities can play. Reliability of grades cannot be accomplished with teachers exploring standards, planning for learning, assessing, and assigning a grade in isolation. We must create time and processes for teachers to engage in this important collaborative work.
I shared the Brookhart and Guskey article with a high school administrative team that is frequently challenged by students and parents concerning substantial differences in teachers’ course requirements and grading practices. I suggested they challenge teachers in same-course PLCs to identify the consistency that currently exists and then plan to increase it across the year, ending with a presentation in March/April on what they discovered, learned, and applied, and the impact it had on teaching and learning.
I suggested they seek consistency in:
- Identifying elements to be mastered
- Measuring mastery
- Assigning a grade
This process would require spending time with each other’s student work and performances. Time and vulnerability are necessary to develop meaningful consistency. Teachers should be advised that this is a messy process. An attempt to quickly “complete the task” is unlikely to produce learning for teachers or students. Healthy debate and disagreement would be a good initial sign. Several iterations, each improving on the last, will likely be needed. Changes in curriculum, students, and teachers will require this to be, in some fashion, an ongoing process each year.
Here are some activities that could be implemented with teachers in a department who teach sections of the same course and meet as a PLC.
- On the next common assessment, each teacher would identify a student in his/her class who scored just above the proficient level and one who scored just below. They would present these, along with their explanations, to each other.
- On the next common assessment, each teacher would send colleagues a copy of an assessment that was just proficient and one that was just below, without identifying which was which. They would come to the next meeting having identified the proficient and non-proficient assessments, with their reasoning, and compare with colleagues’ decisions.
- Following the end-of-unit assessment, bring two A-, B+, B-, and C+ papers to the PLC session for comparison.
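For a PLC that wants a simple number to track from one meeting to the next, the blind-sort activity above lends itself to a quick agreement check. The sketch below is only an illustration, not something from the Brookhart and Guskey article: the ratings are hypothetical, the function names are my own, and it assumes two teachers have each labeled the same set of papers as proficient ("P") or not proficient ("N"). It reports plain percent agreement and Cohen's kappa, a standard statistic that adjusts agreement for what two raters would match on by chance.

```python
from collections import Counter


def percent_agreement(ratings_a, ratings_b):
    """Share of papers on which two teachers gave the same label."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("Both teachers must rate the same non-empty set of papers.")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return matches / len(ratings_a)


def cohens_kappa(ratings_a, ratings_b):
    """Agreement between two raters, corrected for chance agreement."""
    n = len(ratings_a)
    observed = percent_agreement(ratings_a, ratings_b)
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    labels = set(ratings_a) | set(ratings_b)
    # Chance agreement: probability both raters pick the same label at random,
    # given how often each rater used each label.
    expected = sum(counts_a[label] * counts_b[label] for label in labels) / (n * n)
    if expected == 1:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical ratings of the same eight papers by two teachers.
teacher_a = ["P", "P", "N", "P", "N", "N", "P", "N"]
teacher_b = ["P", "N", "N", "P", "N", "P", "P", "N"]

print(f"Percent agreement: {percent_agreement(teacher_a, teacher_b):.2f}")
print(f"Cohen's kappa:     {cohens_kappa(teacher_a, teacher_b):.2f}")
```

The number matters less than the conversation about the papers where the labels differ; those are the ones to put on the table first.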
Such activities should generate considerable conversation, debate, questions and insights. Teacher learning should generate ongoing modifications that should lead to increased reliability and increased student learning.