The duel is on.
First, researchers from Arizona State University made headlines with a report suggesting that efforts in more than half the states to tie serious consequences to student test scores were producing few transferable academic gains. Worse, they said in their December report, the policies may even be pushing academically weak students off the traditional path to a high school diploma.
Now, in a new report, a pair of California researchers contend that the Arizona team may have gotten it at least partly wrong.
In their study, which has been in the works for two years, researchers Martin Carnoy and Susanna Loeb of Stanford University look at some of the same, or similar, testing and student-enrollment data. They conclude that state programs that put strong pressure on students and schools to raise test scores may be more helpful than harmful.
Throughout the late 1990s, the new study contends, scores on federal mathematics tests improved more in states with tougher accountability programs than they did in states without them. And the researchers could find little evidence that more students were repeating a grade, or failing to graduate from high school, as a result.
“There’s lots of reasons to be against tests,” said Mr. Carnoy, the report’s lead author. “But the idea of being accountable in some way and actually putting pressure on schools to improve may not be such a bad thing.”
The differing studies on the effects of the high-stakes testing movement are sure to ratchet up the debate over such programs nationwide.
A total of 28 states now have laws that call for using state tests to determine whether students graduate or are promoted to the next grade, teachers win bonuses, or schools are taken over by the state.
More are expected to follow suit as a result of the “No Child Left Behind” Act of 2001, the latest update of the federal Elementary and Secondary Education Act. It says that, in coming years, all states must adopt such high-stakes testing for schools that receive aid through the federal Title I program for disadvantaged students.
For their part, officials in most states with high-stakes testing programs say that student scores on their own state exams are already improving. Skeptics, however, say that’s because teachers are being forced to “teach to the test” through drills and memorization, not because students are acquiring a deep understanding of the subjects.
To find out how real the gains are and whether they translate to other kinds of tests, both research teams drew on data from the National Assessment of Educational Progress, a congressionally mandated test given to nationally representative samples of students in most states.
Experts said the Stanford study, slated to be published next month in the journal Educational Evaluation and Policy Analysis, takes a finer-grained look at the question than the Arizona State study did.
For instance, the California researchers rate all 50 states on the strength of their accountability efforts. States were given a rating of one, for example, if they tested students in elementary and middle school, but did nothing with the results other than report them to the state.
The highest rating—a five—went to states that reward or punish elementary and middle schools on the basis of student test scores and require high school students to pass a minimum-competency test to graduate.
The stronger a state’s accountability program, Mr. Carnoy and Ms. Loeb found, the greater the gains its students made on the 8th grade NAEP test in math between 1996 and 2000—particularly for students scoring at the “proficient” level.
Scores for 4th graders in states with tough high-stakes measures also improved more than those in other states, though not by as much as the 8th grade scores rose.
For both grades, according to the authors, black and Hispanic students in high-accountability states tended to make greater improvements than white students did.
“That’s important that minorities are making significant gains with these tests, because we’re supposed to be lifting the bottom with these tests,” said Mr. Carnoy, who is an economics professor at Stanford in California.
The researchers also adjusted the data to account for differences between states in the percentages of special education or non-English-speaking students who were excluded from the tests, variations in increases in state spending for education, and other factors.
Their findings are not altogether incompatible with the data in the Arizona study.
Stakes and Dropouts
In that study, researchers Audrey L. Amrein and David C. Berliner describe a similar trend when they note that 8th grade math scores in 19 of the 28 states they examined improved at rates higher than the national average on the same federal tests. (Jan. 8, 2003.)
The Arizona researchers deemed high-stakes programs to have little impact in a state overall, however, if scores improved against the national average in 8th grade but not in 4th grade, or vice versa. That is where they part company with the Stanford team.
“That’s not scientific,” contended Eric A. Hanushek, a senior fellow at Stanford’s Hoover Institution, a public-policy think tank, referring to the Arizona study. “They just add up how many went up and how many went down.”
He and some other experts also fault the Amrein-Berliner study for focusing its analysis only on the 28 states the researchers labeled “high stakes.”
“Why would you compare states with strong accountability to everybody?” said John H. Bishop, an economics professor at Cornell University, in Ithaca, N.Y. “The natural thing to do would be to compare the states that had accountability systems to ones that didn’t.”
Mr. Hanushek and Mr. Bishop, who are economists, also have found improvements in NAEP scores for states with strong accountability programs.
To look for links between the strength of a state’s accountability program and changes in the number of students repeating a grade or failing to reach the senior year of high school, Mr. Carnoy and Ms. Loeb compared enrollment figures for 8th and 9th grades with 12th grade enrollments three or four years later.
“It’s become a kind of mantra with high school tests, that what schools do is basically keep kids back so they don’t have to take these tests, and we find no evidence of this,” Mr. Carnoy said. “On the other hand, if you’re a fanatic of these tests, it’s also not good news that more kids aren’t staying in high school because of these tests.”
The Arizona study, in comparison, zeroed in on 16 of the 18 states with high school graduation tests. In those states, it found, dropout rates increased, graduation rates declined, and the rates at which younger people took General Educational Development exams went up after the new policies took effect.
For her part, Ms. Amrein argued that the study she conducted with Mr. Berliner may be more comprehensive in scope than the Stanford research because it includes such data, as well as scores on NAEP reading tests, the SAT and ACT college-entrance exams, and the Advanced Placement tests used to determine whether students get college credit for coursework in high school.
“So as far as I’m concerned, our conclusions stand regarding the effectiveness, or more specifically, the ineffectiveness, of high school exit exams in increasing academic achievement,” Ms. Amrein said.
She also noted that the Carnoy-Loeb report mistakenly includes Arizona on its list of states with high-stakes exit exams; the state’s requirements do not kick in for students until 2006.
For his part, Mr. Berliner, who is an education professor at Arizona State, in Tempe, said it’s no surprise the studies should arrive at different conclusions.
“Different methods yield different results,” he said. “All this should do is get more researchers involved so that we get more data. It wouldn’t surprise me if we find high-stakes testing has positive results in some states and negative results in others.”
Margaret E. Goertz, who is a co-director of the Center for Policy Research in Education, a federally supported research center at the University of Pennsylvania’s graduate school of education, concurred.
“The more people we can get in doing these kinds of analyses, the better,” she said. “People need to start talking about what are appropriate methodologies for doing this.”
She and other experts pointed out that studying such programs is hard, though, partly because state policies and programs are continually changing, and partly because every state’s take on accountability is slightly different.
“I don’t think we’ll ever have the definitive answer that high-stakes accountability, per se, is good or bad,” Ms. Goertz said.