Reports Find Fault With High-Stakes Testing
A pair of new studies suggests that efforts in more than half the states to tie major consequences to student test scores are producing few translatable academic gains and, in some cases, may even be pushing struggling students off the traditional path to a high school diploma.
"We need to sit back and start thinking about whether the very few positive effects we are seeing outweigh the many negative effects we are starting to find," said Audrey L. Amrein, a researcher at Arizona State University in Tempe and the lead author for both reports. "It would be a real shame if we continued down this same path without deliberating more on that."
The studies, which Ms. Amrein produced with ASU education professor David C. Berliner, are described as the largest so far to examine the merits and drawbacks of states' high-stakes testing programs.
Expanding on a study the authors published last spring in the electronic journal Education Policy Analysis Archives, the reports were paid for by the Great Lakes Center for Educational Research and Practice, a Midwestern group of six teachers' union affiliates that have been critical of such testing policies. The researchers hope to update the reports annually. ("Study Argues Test Policies Don't Work," April 24, 2002.)
Keeping an eye on such programs is especially critical now, many observers say, because they have become a centerpiece of federal education policy. The "No Child Left Behind" Act of 2001, the most recent update of the federal Elementary and Secondary Education Act, requires that, in coming years, all states adopt high-stakes testing for schools that receive aid through the federal Title I program for disadvantaged students.
And the Arizona researchers' reports are already stoking debate among academics and policymakers who argue, on one side, that the studies prove that the Bush administration's embrace of make-or-break tests is wrongheaded, and, on the other, that the findings may be biased against rigorous testing policies.
Ups and Downs
For the two studies, Ms. Amrein and Mr. Berliner collected data on the 28 states that over the past two decades have begun using state tests to determine whether students graduate or are promoted to the next grade, which teachers win bonuses, or which schools are taken over by states or districts. Those kinds of consequences earn tests the "high-stakes" label.
In the first study, the authors were interested in whether gains most of those states have noted on their tests would transfer to other, independent exams taken by wider groups of students.
They found that, after adopting their new testing policies, 19 of the 28 states saw decreases in 4th grade mathematics scores on the federally sponsored National Assessment of Educational Progress when compared with the national average. By contrast, on the 8th grade NAEP math tests, 18 states gained against the national norm.
States were more evenly split on the NAEP 4th grade reading test, with 14 states showing gains compared with national trends.
On the SAT and ACT college-entrance tests, twice as many of the states slipped relative to the national average as gained.
Likewise, trends in Advanced Placement tests were worse than the national average in 16 of the 28 states.
"The data presented in this study suggests that after the implementation of high-stakes tests, nothing much happens," the report concludes. "Students are learning the content of the state-administered tests and perhaps little else."
For the second study, the researchers focused on 16 of the 18 states that have made passing such tests a requirement for high school graduation. In most of those states, they found that dropout rates increased, graduation rates declined, and the rates at which younger people took General Educational Development exams went up after the policies took effect. Based on those figures, as well as anecdotal evidence from the states, the authors contend that schools might be forcing out students who could drag down aggregate test scores.
"In my mind, the take-home message of these reports is that high-stakes accountability is not a sure-fire method of improving student achievement," said Gregory J. Camilli, a professor of educational measurement and statistics at Rutgers University in New Brunswick, N.J., who reviewed the reports.
Chester E. Finn, Jr., the president of the Washington-based Thomas B. Fordham Foundation and a former assistant secretary of education in the Reagan administration, disagreed. He is skeptical of the findings, he said, in part because of Mr. Berliner's previously vocal opposition to high-stakes testing.
"Moreover," Mr. Finn added, "that study did a weak job—in part because it's impossible to do a good job—of controlling for the zillion other policy changes under way in those states during the period they were seeking to gauge the effects of high-stakes testing."
Mr. Finn and others have also noted that the college-entrance exams in the study draw from a smaller subset of students, those who harbor college ambitions.
"That is a weakness of the study," Ms. Amrein acknowledged, but she said the data the researchers used were, nonetheless, the best available.
Daniel M. Koretz, a senior social scientist at the RAND Corp., a think tank based in Santa Monica, Calif., and an education professor at Harvard University, said the studies also could benefit from some more fine-grained analyses.
States that ratcheted up the stakes in their testing programs in the late 1970s and early 1980s, for example, tended to use minimum-competency tests rather than the more rigorous tests that characterize high-stakes efforts now, he pointed out. But the ASU research does not differentiate between them.
Earlier studies, which focused on smaller numbers of states, have differed over whether high-stakes tests lead to real academic achievement gains for all students. Scholars said the new studies would not end that debate.
"Basically," Mr. Koretz said, "I just don't think we know enough yet about the broad sweep of the impacts from all of these tests."
Vol. 22, Issue 16, Page 5