Jack and Julian continue their conversation this week, shifting focus to the role of testing in K-12 education.
Schneider: Last week we discussed the role of charter schools in the landscape of American public education. And one question that came up was about how to measure school performance.
I think most stakeholders are in agreement about the inadequacy of student standardized test scores. Even Michelle Rhee agreed with me on that point. But where do we go from here? And how should we track student learning if not by tests?
Heilig: Before we discuss this, I think it is important to contextualize the test debate. As the nursery for No Child Left Behind, Texas has never seen a test it didn’t like. But in the last few years things got out of hand, and the state was requiring that students take 15 exams to graduate from high school. In the last legislative session, politicians finally got the message from parent organizations like TAMSA and reduced the number of tests required for graduation to five Pearson-produced STAAR exams. Notably, five Pearson exams to graduate is still more than the previous Pearson TAKS testing regime required.
An interesting addendum to the high-stakes testing conversation was recently introduced into the public discourse by Jason Stanford in the Texas Observer. Stanford detailed the research and public legislative testimony of Dr. Walter Stroup, whose work suggests that about 70% of the eventual outcome for a student on any Pearson test in Texas was easily predictable before the student even sat down to take the exam. There were apparent repercussions for Dr. Stroup at the University of Texas College of Education, which has received hundreds of thousands of dollars from Pearson and where some prominent faculty and leaders are involved with the corporation via consulting and publishing deals.
The tentacles of foundations and corporations reaching into our colleges and universities, and their influence on education “reform,” “research,” and the public discourse, are an issue we must grapple with as a society. So before we move on to alternatives, I think it is important that we as a society ruminate on the power and quality dynamics that are readily apparent in the high-stakes testing debate.
Schneider: Companies like Pearson do a nice business producing tests. And you’d have to be pretty naïve to think that those tests do much to promote learning.
Yet at the same time, it’s worth noting that our national preoccupation with testing was not manufactured by Pearson. The SAT is nearly a century old. And at the K-12 level, the New York state Regents exam is 150 years old. So while corporate lobbying and palm-greasing are disturbing in their own right, we can’t pin the obsession with testing on Pearson.
The inclination toward testing isn’t particularly nefarious, either. It’s problematic, certainly. But the idea driving testing is that, in order for there to be a system—across levels and across vastly different regions—there needs to be some way of determining how students are doing. Again, across thousands of different schools, staffed by millions of teachers—all of whom have their own idiosyncratic approaches to evaluation.
Do we have tests that serve as fair and comprehensive indicators of student progress? No. But that’s a different question.
Heilig: I actually think that tests are succeeding at what they were designed to do, which is to sort members of society. High-stakes exams were not designed to identify achievement gaps; they were designed to create the gaps.
There is a line of thinking that high-stakes exams will create equity in K-12 education. And politicians have been telling us this for decades. Yet here we are in 2014, and not all students are proficient, as NCLB promised a decade ago. Tests function as they were originally designed: to sort.
If you don’t pass high-stakes exams in Texas, or anywhere else, and can’t graduate from high school, you are most likely consigned to a lifetime in the bottom rungs of the labor market. High-stakes testing does not create equality; the tests serve the purpose for which they were originally created: stratification. Or in the words of NCLB: “Far Below Basic,” “Below Basic,” “Basic,” “Proficient,” and “Advanced.”
Schneider: Yes, sorting is a big part of testing. There’s no denying it. But that makes it seem like anyone who favors testing stands on the wrong side of the equity debate. And unfortunately it just isn’t that simple.
Remember, for instance, that the SAT test (which I happen to be very much against these days) was originally designed as a means of identifying talented students who weren’t attending elite schools. It helped non-dominant culture groups crack into what was essentially an old-boys network.
The Advanced Placement test, which also has some issues, spread because it gave low-income students a chance to show that they could complete the same kind of rigorous coursework as their more privileged peers. And unlike the SAT, the AP test was actually related to things students were learning in school—a step forward, in my mind.
And though NCLB’s theory of action was flawed (threaten districts with sanctions and they will try harder to educate kids), bipartisan support for the initial bill was driven by a belief that vulnerable students were being failed. The disaggregation of student achievement data by race, income, and special education status led to concrete action. Unintended consequences abounded, certainly. But most of them are the result of simplistic, rather than malicious, thinking.
So is the philosophy behind testing just about sorting? I’m not so convinced. I think most of our tests do a very poor job of measuring the things we care about. But I’m not sure that’s by design.
The opinions expressed in K-12 Schools: Beyond the Rhetoric are strictly those of the author(s) and do not reflect the opinions or endorsement of Editorial Projects in Education, or any of its publications.