Only three school reform models make the grade in a new study.
Before this guide, the only way to judge a program was by its advertising, says one official.
Architects of school reform models may have reason to worry. On the heels of controversy over a federal list of suggested programs, a new independent study has concluded that just three of 24 popular models have strong evidence that they improve student achievement.
An Educators' Guide to Schoolwide Reform, a 141-page report from the Washington, D.C.-based American Institutes for Research, found that only Direct Instruction, High Schools That Work, and Success for All made the grade. Commissioned by five education groups, including the National Education Association and the American Federation of Teachers, the report is the most comprehensive rating of school reform programs.
Such information is desperately needed, says Paul Houston, executive director of the American Association of School Administrators, another sponsor of the report. "Before this guide came along, about the only way educators could judge the worth of some of these programs was by the quality of the developers' advertising and the firmness of their handshakes. Now, superintendents, principals, and classroom teachers can sit down together and make reasonable decisions about which is best for their district's needs."
But some developers are questioning how AIR decided which studies to include as evidence of a program's effectiveness. Others maintain that they have more evidence of positive results than AIR gives them credit for. Henry Levin, a Stanford University economist whose Accelerated Schools program received only a "marginal" rating, described the study as "fairly amateurish."
"Basically, they discounted anything, as far as I can tell, that comes in and changes test scores over time for a particular school," he says. "And [any program] that said it had a comparison group was given a gold standard."
Such criticism echoes the recent hubbub over the federal list of suggested reform models. [See "Who's In, Who's Out," March.] That list, which included 17 programs, was intended to guide schools seeking some of the $150 million that's available as part of the Comprehensive School Reform Demonstration Program. But reformers who didn't make the cut contested Congress' selection process.
After the release of the AIR report, some developers welcomed the new scrutiny. More than anything, they say, the AIR study underscores the need for strong, third-party evaluations of schoolwide reform models. Similar studies are now completed or in the works.
"The fact is that the capacity to do this kind of research is very limited in this country," says Marc Tucker, a founder of America's Choice, one of the 24 models reviewed by AIR. "I believe that it's very important for the federal government to put a fair amount of money on the table to make this kind of research possible."
Ellen Condliffe Lagemann, president of the National Academy of Education, a group of education researchers and scholars, agrees. "It's amazing how little evaluation there is," she says. "Since the early 20th century, the people who have peddled the educational reform strategies that we all hear about tend to be successful because they're the best entrepreneurs. It doesn't necessarily have to do with any research capability."
AIR's consumer-oriented guide rates 24 whole-school reform models according to whether they improve achievement in such measurable ways as higher test scores and attendance rates. It also evaluates the assistance provided by the developers to schools that adopt their strategies and compares the first-year costs of such programs. "We wanted to have a document that really, critically evaluated the evidence base underpinning these programs," says Marcella Dianda, a senior program associate at the NEA, which helped underwrite the $90,000 study. "We felt that our members really wanted that. They wanted us to get to the bottom line."
The evaluators used a multistep process to rate whether the programs had evidence that they raised student achievement, according to Rebecca Herman, the project director. First, AIR gathered almost any document that reported student outcomes, including articles in scholarly journals, unpublished case studies and reports, and changes in raw test scores reported by the developers. More than 130 studies were then reviewed and rated for their methodological rigor in 10 categories, based on such criteria as the quality and objectivity of the measurement instruments used, the period of time over which the data were collected, the use of comparison or control groups, and the number of students and schools included.
Studies that met AIR's criteria for rigor were used to judge whether a program was effective in raising student achievement. For example, a number of developers submitted changes in state or local test scores as evidence that their programs were working. But, says Herman, "we really didn't consider test scores alone, without some sort of context, because there are a lot of things that can explain changes in test scores."
In its final analysis, the study gave a "strong" rating to the programs with the most conclusive supporting research, notably four or more studies that used rigorous methodology and found improved achievement. The gains had to be statistically significant in at least three of those studies. A "promising" rating went to models with three or more rigorous studies that showed some evidence of success. A "marginal" rating went to reform models that had fewer rigorous studies with positive findings or a higher proportion of studies showing negative or no effects. A "mixed or weak" label was assigned to programs with study findings that were ambiguous or negative. And AIR gave a "no research" rating to programs for which there were no methodologically rigorous studies. Eight programs received the "no research" rating. That is not surprising, according to Herman, given the newness of many of the models.
"It takes a good three years to implement a reform model across a school, and another two years to come up with a decent study," she says. "What we're looking at is the first wave of research, and we're hoping for an ocean to follow it."
The study comes as districts around the country seek proven, reliable solutions to the problem of low-performing schools. But as they spend greater amounts of tax dollars on the various reform models, questions remain about how well the programs work. About 8,300 schools nationwide, roughly 10 percent, were using one of the 24 designs rated in the study as of October 30, the report says. Yet it notes that "most of the prose describing these approaches remains uncomfortably silent about their effectiveness."
Copies of the report, An Educators' Guide to Schoolwide Reform, are available from the sponsoring organizations for $15.95 each for nonmembers and $12.95 each for members. The full text of the report is also available on the World Wide Web at www.aasa.org/Reform/index.htm.
The Research section is underwritten by a grant from the Spencer Foundation.
Vol. 10, Issue 7, Pages 19-20
Published in Print: April 1, 1999, as The A-List