There are two main arguments against using standardized tests to guarantee that students reach at least a basic level of academic competency. The first is radical: These tests are not necessary. The second—less radical and more familiar—is that, even if standardized testing were an efficient benchmark of basic skills, the costs associated with it are too high.
Standardized tests are unnecessary because they rarely show what we don’t already know. Ask any teacher and she can tell you which students can read and write. That telling usually comes in the form of letter grades or evaluations that break down progress on skills. So trust the teacher. Publish grade distributions. Locally publish a compilation of evaluation reports. Release a state or national report reviewed and verified by expert evaluators with legislative oversight.
People will say: “That’s crazy! Schools will fudge results. Grade data means nothing because teachers apply different standards with different values. Let’s give them all one reliable test. And won’t this proposal create a whole new bureaucracy?”
All true (except for the one test being reliable). Given high stakes and the accompanying pressure, people will game a system. And it is all too true that grades vary widely because of four factors: a teacher’s conception of achievement, a teacher’s sense of equity and rigor, a teacher’s ability, and the composition of students.
But people are already gaming standardized testing, sometimes criminally. And, at a basic level of competency, a grade or an evaluative report would give us as much information as we now get from standardized tests.
We have the grade problem at my high school. In the same course or department, a B in one classroom might be an A, or even a C, in another. It’s a problem for us, and, likely, a problem in most schools.
But it has also been an opportunity. Recognizing our grading differences, we opted to create a common conception of achievement, our graduate profile, and department learning outcomes with rubrics. Our standards now align closely with the Common Core State Standards. Second, we created common performance tasks that measure these standards and formative assessments that scaffold to them. Third, we look together at student work. Fourth, we have begun to grade each other’s students on these common tasks.
We could publish the results of these performance tasks, and the public would have a good idea of what we’re good at and what we’re not. For example, our students effectively employ reading strategies to comprehend a text, but are often stymied by a lack of vocabulary or complex syntax. We’ve also learned most of our students can coherently develop a claim, citing the appropriate evidence to support it when choosing from a restricted universe of data. They aren’t as good when the universe of data is broadened. They are mediocre at analysis, counter-arguments, rebuttals, and evaluation of sources, though they have recently gotten better at evaluating sources as we have improved our instruction and formative assessments. A small percentage of our students do not show even basic competency in reading and writing.
That’s better information than we’ve ever received from standardized testing. What’s also started to happen is that teachers who use the same standards and rubrics, assign the same performance tasks, and grade each other’s work are finding their letter grades starting to align.
This approach has also led to a lot of frank discussions. For example, why are grades different? Where we have looked, different conceptions of achievement and rigor seem most important. So we have to talk about it. The more we do, the more aligned we will become, and the more honest a picture of achievement we can create. It has been fantastic professional development, done without external mandates. We have a long way to go, but we can understand the value of our efforts and see improvement in student work.
I would not advocate publishing individual teachers’ grades because it would cause the same problems as publishing individual teachers’ standardized-test results, but grades by subject, grade level, and demographic categories could be fair game externally. Internally, those breakdowns should stimulate hard conversations and necessary professional development. Of course, this proposal would have to be negotiated and modified locally to avoid the punishment/reward cycle of other accountability measures that force people to conform and tempt them to cheat. The goal is to spur the collaboration and conversation necessary for improvement.
Well, that’s your district, some might say. It’s got a unique collaborative culture and a better sense of achievement than most. You can’t do that across the nation.
Why not? With the common core, a definition of achievement exists. And teachers are more likely to respond to professional development and accountability more concretely connected to their daily work. They are more likely to improve.
That leads to the second argument. Even if standardized testing were both a desirable and an efficient way to give the public a picture of basic competencies, the costs have been too great.
Many have already made cogent arguments in this vein (unrealistic definitions of achievement; skewed instructional schemes; inequitable curricular offerings; inevitable corruption; perverted charter school missions; and the alienation, disempowerment, and embarrassment of educators), but let’s think about a supposed example of success on this front: a school with high test scores.
In general, such a school has a compliant or affluent population. Test scores are a point of pride. The school has a good reputation. But, when you go in and observe, the teaching and learning do not impress.
Never once have I looked at the test scores of this kind of school and thought, “How could I be more like them?” That’s because success represented just a score on a narrow test of a limited band of achievement (a test, by the way, with content that I was not even legally allowed to talk about), and I couldn’t see how looking at that score could help me in my day-to-day teaching. Even worse, I don’t think the teachers at such schools have learned much from their good scores. If anything, the scores have prevented them from becoming better.
So, to sum up, we don’t learn much from standardized testing, and we have lost a great deal by giving it so much prominence. The common core is at risk of failure, not because the standards are bad per se, but because with standardized accountability, as in so many partial reforms, we again won’t get a real picture of achievement, people will be disappointed, and the standards and testing will run their course.
Instead, why not just trust teachers and schools to report the progress of their students with the measures they have, and use internal and external local pressures to improve the measures and practices? Doing so would avoid a plethora of social, emotional, and political costs. Any bureaucracy created can’t be more of a drag on the government or economy than the legion of consultants and think tanks feeding at the trough of education today. This proposal is more in line with what we know about the success of sustainable local organizations and what we know about the inflated rise and inevitable fall of mass reform movements.
A version of this article appeared in the July 10, 2014, edition of Education Week as “We Don’t Need Standardized Tests. Here’s Why.”