A large proportion of schools in every state could be identified as “needing improvement” under the new federal education law and could eventually be subject to penalties.
Although estimates vary, state officials who have begun crunching the numbers predict that as many as three-fourths of their schools could fail to make their annual growth targets, or “adequate yearly progress,” under the 2001 reauthorization of the Elementary and Secondary Education Act.
“It’s going to really be a nightmare for states,” predicted Cecil J. Picard, the state superintendent of education in Louisiana.
He estimates that up to 80 percent of the schools in his state could be targeted as needing improvement or corrective action in the first few years. “I don’t think that’s going to be acceptable, and I don’t think that’s going to fly politically.”
Scott Marion, the director of assessment and accountability for the Wyoming education department, says his state is in the same boat. “The bottom line is that we’re going to end up identifying, by any stretch of the imagination, incredibly more schools than we believe the resources are there to serve,” he said. “We’re talking anywhere from a low of 40 percent to a high of 80 percent of schools identified in one year alone.”
The 2001 rewrite of the ESEA, the major federal law in K-12 education, requires states to bring all students up to the “proficient” level on state tests in 12 years, or no later than the 2013-14 school year. The law spells out in complicated and specific detail how states should set the annual targets for schools and districts to meet that goal, with the initial target based on data from this school year.
Schools that fail to make that target, known as “adequate yearly progress,” for two consecutive years are identified as “needing improvement.” Those that receive federal Title I money must develop a plan to improve, and must offer students a choice of public schools to attend. They are also supposed to receive technical assistance from their districts.
After three years, such schools are also required to offer supplemental services, such as tutoring, to students from either public or private providers. And, after four years, they will be subject to progressively more severe corrective actions and could ultimately be shut down.
The requirements were the subject of intense congressional negotiations, in part because of concerns that draft language in both the House and the Senate versions of the bill would have resulted in too many schools being identified as failing. While the final legislation is an improvement over the original proposals, experts say, it has far from solved the problem.
“There are just so many holes in this thing when it comes to being operational that you could drive a truck through it,” asserted Richard K. Hill, the executive director of the Portsmouth, N.H.-based Center for Assessment, which provides advice to states on their accountability systems.
A study by the Congressional Research Service estimates that 17 percent to 64 percent of the schools serving grades 3-8 in three states—Maryland, North Carolina, and Texas—would have failed to meet the AYP requirement for two consecutive years, if the law had been in effect during school years 1997-98 through 1999-2000. The highest estimates—up to 64 percent of schools—were for North Carolina, the only state that had data available for every student subgroup to be broken out under the law’s accountability system.
Wyoming officials analyzed test-score data for 4th graders for the 2000-01 school year in both reading and mathematics. The results indicate that up to half the state’s schools would fail to meet the AYP criteria in the baseline year.
“A starting point that identifies at least half of our schools in year one will undoubtedly lead to almost all of our schools identified in need of improvement within a few years,” Superintendent Judy Catchpole and her colleagues wrote in a letter to federal officials. “Surely, this is not what the framers had intended when they wrote this law to concentrate resources and services on those children most in danger of being left behind.”
Similarly, analyses by the Center for Assessment indicate that more than 75 percent of the schools and districts in the states it has studied would be designated as failing to satisfy the AYP criteria in the first year.
“The feeling among most states is that the actual calculations of the AYP formula don’t yield the results that were intended,” said Scott M. Norton, the director of standards and assessments for the Louisiana education department. “How could there have been a law written with the intent of putting all or nearly all of the schools in corrective action or school improvement? It just doesn’t make any sense.”
In an interview last week, one congressional aide said lawmakers looked at data runs in drafting the requirements and tried to come up with some reasonable estimates. “We tried to be fair,” the aide said, “but we wanted to put the fire under states to help these schools do more.
“We did intend for this to be a wake-up call, no question. A lot of people in Congress felt for too long states had been cooking the books, so to speak. We didn’t want every district in every state to be in school improvement, but we expected a lot because we’re not doing a good job.
“The school improvement problem is not a problem,” added the congressional staff member, who spoke on condition of anonymity. “It’s a call to help those schools. ... If they don’t want them to fail, then let’s do more for them.”
‘A Statistical Impossibility’
Under the 1994 version of the ESEA, states were permitted to set their own definitions for adequate yearly progress and had no deadline for bringing all, or even some, of their students to proficiency on state tests. As a result, the proportion of Title I schools deemed as needing improvement varied widely from state to state, from 1 percent of schools in Texas to 76 percent of schools in Michigan in 1998-99.
Congressional leaders, who felt that states had abused the flexibility they had been given, decided to be much more specific when they rewrote the law last year. In addition to setting a 12-year deadline by which all students in a state must reach proficiency, they required that states set annual, measurable targets for reaching that goal.
For a school or district to make adequate yearly progress, it must reach the same annual target both for its total student population and for specific subgroups of students: students from poor families, students from major racial and ethnic minority groups, students with disabilities, and those with limited fluency in English.
“The fact that there’s only one goal and every subgroup has to meet the same goal is probably the major problem,” said Mr. Norton of Louisiana.
Because the U.S. Department of Education has yet to issue guidance or draft regulations about adequate yearly progress, it’s not clear precisely how states should set the baseline for measuring gains.
The law requires that states choose the higher of two bars for baseline proficiency. The first is the percent of students at the proficient level in the lowest-performing subgroup in the state in 2001-02. The second option is for states to rank schools by the percent of students proficient on state tests in 2001-02, and then set the bar at the point at which one-fifth of all students are in schools with lower proficiency levels.
Experts predict that most states will end up using the latter, because that figure will almost always be higher than the proficiency level for the state’s lowest-performing subgroup, such as students with disabilities or Native American students.
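As a rough illustration of the two-option rule described above, the calculation might look like the following sketch. The school and subgroup figures here are hypothetical, not actual state data, and the procedure is a plain reading of the law's language rather than official guidance.

```python
# Illustrative sketch of the two baseline options; all numbers are hypothetical.

def baseline_bar(schools, lowest_subgroup_pct):
    """schools: list of (pct_proficient, enrollment) tuples.
    Returns the higher of the law's two starting bars."""
    # Option 2: rank schools from lowest to highest percent proficient,
    # then take the proficiency level of the school containing the
    # student at the 20th percentile of statewide enrollment.
    ranked = sorted(schools)  # ascending by percent proficient
    total = sum(n for _, n in ranked)
    cutoff = 0.20 * total
    running = 0
    for pct, n in ranked:
        running += n
        if running >= cutoff:
            option2 = pct
            break
    # The law requires the higher of the two bars.
    return max(lowest_subgroup_pct, option2)

# Hypothetical data: (percent proficient, enrollment) per school.
schools = [(25, 400), (40, 300), (55, 500), (70, 350), (85, 450)]
print(baseline_bar(schools, lowest_subgroup_pct=18))  # prints 25
```

In this toy example the 20th-percentile bar (25 percent proficient) exceeds the lowest subgroup's rate (18 percent), which is the pattern experts expect in most states.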
As a result, they argue, a large number of schools in a state will already have scored below the target for overall school performance set in the baseline year, by definition. Once states add in schools where at least one subgroup also fails to make that target, the number of schools that fail to make adequate yearly progress starts snowballing. And when schools are then required to make annual gains in performance for every subgroup, experts say, the problem increases exponentially.
“Our biggest concern is this idea that you’re going to have to have even progress across all subgroups because that’s almost a statistical impossibility,” said Dennis W. Cheek, the director of research for the Rhode Island education department. “We found that if you held strictly to that, there’s virtually no school in the state over the past four years that would actually meet that kind of criteria.”
In North Carolina, said Gongshu Zhang, a research consultant who has run preliminary analyses for the state, some schools may have up to 10 subgroups of students. “So now, the law says we must disaggregate student data by 10 groups, and each group should be separately evaluated by reading and math. Then, if one group fails to make adequate yearly progress, the whole school will be identified as failing to make adequate yearly progress,” he said. “So this is a much, much higher standard than to just evaluate a school as a whole.”
Based on his calculations, Mr. Zhang said that, generally speaking, only about one-quarter of North Carolina’s K-8 schools fulfilled the AYP criteria from the 1998-99 to 2000-01 school years. “This already includes the safe-harbor provision,” he said.
The “safe harbor” provision permits schools to make their AYP targets, even if a particular subgroup fails to meet its goal, as long as the proportion of students in that subgroup who score below proficient has declined by at least 10 percent since the previous school year. Many state officials said that will still be a stretch for some subgroups, given their current proficiency levels, and that the safe harbor may be well out of reach.
In Wyoming, Mr. Marion speculated that “any school with a large enough contingent of special education students to be called a group will be put in improvement.”
Part of the problem is that school rankings are inherently unstable, with schools that improve one year tending to fall back the next, as student populations in a given grade shift or because of other natural volatility in test scores. The problem is even worse for small schools or for subgroups of students within schools, because the number of test-takers being considered is lower. Schools with more diverse populations, and hence, more subgroups, are particularly likely to be mislabeled.
David Figlio, a professor of economics at the University of Florida in Gainesville, analyzed five years’ worth of state test data from two Florida districts. Based on such year-to-year bounce, he concluded, “it may be nearly impossible for a school to experience persistent improvements across a wide variety of subgroups.”
Averaging a school’s test-score data over several years, as permitted under the ESEA, could help. For example, states could compare the percent of students in a school who scored at the proficient level in reading averaged over the 1998-99, 1999-2000, and 2000-01 school years with the average over the 1999-2000, 2000-01, and 2001-02 school years.
Using such three-year rolling averages reduced the number of “unstable” schools (those that improved one year and fell back the next, or vice-versa) from about 57 percent to 33 percent in the Florida districts he studied, Mr. Figlio said.
Thomas J. Kane, a professor of policy studies and economics at the University of California, Los Angeles, suggests going one step further and weighting those averages by school size to account further for the fluctuations in test scores in small schools. But Mr. Figlio cautioned that, even with three-year rolling averages, “many schools—dozens in any large district—will have erratic patterns of performance that have little to do with changes in school quality.”
One way states could reduce the number of schools needing improvement is in their determination of how many students are necessary to yield “statistically reliable information” for subgroups. Under the law, that determination is up to each state. The tension, noted Brian Gong, the associate director of the Center for Assessment, is that while increasing that number even a little could increase reliability and greatly reduce the number of schools identified as failing to meet their targets, it also means that large numbers of schools would not be held accountable for subgroup performance at all.
Officials in states such as Massachusetts, North Carolina, and Rhode Island also are concerned about sending mixed signals about whether schools are succeeding or not, as they try to mesh the federal requirements with their own state accountability plans. State officials worry that the requirements in the law are so specific that it may be difficult to continue their existing accountability models, even if those models have been developed over many years.
‘Not Much to Negotiate’
State officials are also concerned about how to set 12-year timelines for making improvements when they are changing the tests on which those targets are based, or adding new tests, as required by the law.
Although the ESEA mandates that states set a baseline this school year, many states will not have all the requisite tests in place by then. And it’s not clear how they should adjust their AYP goals as new tests come on line.
Texas, for example, is putting in a new testing program in the 2002-03 school year and won’t have statewide results until that spring. “So we won’t have any data to even set a baseline, until we have the 2003 test data,” said Criss Cloudt, the associate commissioner for accountability in the Texas Education Agency, “and we won’t really see gain information from that testing program, of course, until the 2004 testing administration.
“It’s going to be an issue for us,” she continued, “trying to figure out how to address the requirements of the statute.”
Many people had hoped that the topic of adequate yearly progress would be considered during negotiations over draft rules on standards and assessments last month. But the Education Department decided to issue draft regulations on the AYP separately. (“Testing Rules Would Grant States Leeway,” March 6, 2002.)
“There’s not much to negotiate,” Susan B. Neuman, the assistant secretary for elementary and secondary education, said in an interview. “It’s in the law. It’s there. The fact of the matter is we’re not able to negotiate on that.”
Even so, state officials and others are continuing to press for as much flexibility as they can get. “I think what we’d like to see is for each state to be able to put forward a well-reasoned, data-grounded approach to what they want to do, and then negotiate with the federal government in terms of whether that’s acceptable or not,” said Mr. Cheek of Rhode Island.
“We’re not asking to be excused,” agreed Bob Harmon, the assistant superintendent for special programs for the Washington state education department. “We’re not saying this is garbage. Just help us find a way to make it work.”
Mr. Marion, the state testing director in Wyoming, noted that school-level accountability systems are a relatively recent phenomenon. “By shutting off all possible models besides this one, we’re not allowing ourselves to learn which ones can be the most effective as a country,” he said. “We don’t know enough to say that this is the model.”
At the same time, state officials acknowledged that their current projections are based on previous test-score data and do not reflect the new incentives that schools may have to improve student performance overall, or to close achievement gaps, under the revised ESEA. In North Carolina, for instance, test-score gains were more pronounced in the first few years after passage of the state’s education reform law in 1996 than from 1999 to 2001. If the state could regain that momentum, noted Mr. Zhang, far more schools would meet the AYP targets.
Moreover, because the federal Education Department has yet to issue any guidance or regulations on adequate yearly progress, some of the interpretations states have used in building their projections may be off base or may not take advantage of all the options available. The department hopes to have final regulations on adequate yearly progress in July. An aide on Capitol Hill said lawmakers hope the department “will continue to work with states to make sure they are using all of the flexibility they can under the law.”
“If there are individual problems with the requirements, I’m sure the department is going to work on making things work,” the aide said. “But in terms of the general thrust of [the law], I can’t believe they’re going to walk away from that. The law is pretty clear.”
Joseph F. Johnson Jr., the director of compensatory education programs for the U.S. Department of Education, suggested that states might as well get used to the new demands.
“The bottom line is that if you’re going to reach the goal of getting all schools to achieve very challenging results, schools and school districts are going to have to make some very dramatic changes in instruction,” he said. “So they may as well get about doing it.”
A version of this article appeared in the April 03, 2002 edition of Education Week as ‘Inadequate’ Yearly Gains Are Predicted