The No Child Left Behind Act of 2001 set as a national goal that all children would be proficient in reading and math by 2014. But perhaps reflecting bruising battles over President Bill Clinton’s proposed “voluntary national test” in the late 1990s, the law left both the design of the tests and the setting of “cut scores” to the states. The dubious outcome of this choice is well documented: States set their cut scores so low that large numbers of students were judged proficient even though they lacked basic skills.
When the National Center for Education Statistics mapped these state proficiency standards onto the National Assessment of Educational Progress scale, it found that in 2007, seven states had set their 4th grade math proficiency standards below NAEP’s basic level. Only Massachusetts had set its proficiency standard equal to NAEP’s. Using different data, the Thomas B. Fordham Institute found similar results, calling its findings “The Proficiency Illusion.”
NAEP’s recent Trial Urban District Assessment, or TUDA, covering math in 18 cities presents another window into this illusion. Given the concentration of low-performing students in large urban districts and the need to radically transform their education, the stakes in accurately measuring student skills and progress in these school systems are profound. The following overview of findings from the TUDA study focuses on the 4th grade math results, but those for the 8th grade are virtually identical.
If we looked only at state test results, we could conclude that most students in these large urban districts are doing reasonably well. In six of the districts, more than three-quarters of pupils are proficient, and in four others more than two-thirds are.
But the TUDA study describes a different world. Across these 18 districts, only 24 percent of students meet NAEP’s proficiency standard, far fewer than the 67 percent average across state assessments.
Another way of thinking about this disconnect is in terms of the gap between the percent proficient on state assessments and on NAEP. The gaps range from nearly 70 points in Baltimore and Detroit to about 30 points in San Diego, Jefferson County, Ky., and the District of Columbia. Boston is the only school system in the study in which more students scored proficient on NAEP than on the state test.
This suggests that even the Fordham Institute’s provocative title understates reality. When it comes to these urban districts, the “proficiency illusion” is more a “proficiency delusion.”
One of America’s biggest challenges is to make progress on the number of students who are proficient in math (and, of course, in reading). Indeed, the No Child Left Behind law’s focus on “adequate yearly progress,” or AYP, was perhaps its most challenging and consequential dimension. Do NAEP and state measures paint a similar picture in terms of progress?
To help determine that, we can calculate changes in the percentage of students proficient on both NAEP and the state assessments for the 11 districts that took part in TUDA in 2007 as well as 2009. In nine of those districts, at least the direction of the change was the same on both assessments (Atlanta and Austin, Texas, are the exceptions). In terms of the magnitude of change, seven districts recorded more growth on state assessments than on NAEP, and in Cleveland, the decline in the percent proficient was smaller on the state test than on NAEP.
As a rough indication of how state assessments and NAEP differ in measuring progress, the average gain in percent proficient on state assessments across these 11 districts was 10 percentage points over that two-year period. The average gain on NAEP was only 4 percentage points. If we rank the districts based on the size of their gains on each assessment, the correlation between the two rankings is only 0.20 and not statistically significant.
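For readers who want to repeat this kind of comparison, the following is a minimal sketch of the arithmetic described above: average the gains on each assessment, then correlate the district rankings. It assumes a Spearman-style rank correlation, which the article does not name explicitly, and the gain values are hypothetical placeholders, not the TUDA or state figures. The function names are illustrative only.

```python
from statistics import mean


def ranks(values):
    """Rank values from 1 (smallest) upward; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with the one at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def rank_correlation(x, y):
    """Correlation of the ranks of x and y (assumed here to be Spearman's method)."""
    return pearson(ranks(x), ranks(y))


# Hypothetical two-year gains in percentage points for five made-up districts.
# These are NOT the actual state or NAEP results.
state_gains = [12, 3, 8, -2, 15]
naep_gains = [4, 5, 1, -3, 2]

print("average state gain:", mean(state_gains))
print("average NAEP gain:", mean(naep_gains))
print("rank correlation:", round(rank_correlation(state_gains, naep_gains), 2))
```

A value near 1 would mean the two assessments order the districts' progress in much the same way; a value near 0, like the 0.20 reported here, means the rankings have little in common.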
In short, we’re using different yardsticks to measure progress—and finding only limited agreement between them.
Clearly, setting a national goal of having all students proficient by 2014, and yet leaving states to create their own tests and set their own cut scores, has produced a mess. Of course, state assessments and NAEP do not have identical purposes, but can these differences account for the wide variation in estimates of percent proficient? NAEP’s proficiency standards are often viewed as “aspirational,” setting a high bar, yet international comparisons show our students lagging behind major competitors. So what may be “aspirational” for our nation is about basic for many others.
Increased scrutiny from the federal government and requirements to make test results more widely available than ever before have led states to this defining-down of proficiency. But it is a move that will thwart efforts to improve their schools.
Outrage is often a powerful prod to reform, and state tests may be minimizing that potent force by presenting an unrealistically rosy picture of student proficiency. When parents in, say, Detroit are told that more than 70 percent of their 4th graders are proficient in math, they will feel quite differently about their schools than when told that just 3 percent have cleared the NAEP proficiency bar. Indeed, announcement of the TUDA results set off a barrage of national and local press reports calling Detroit the worst school district in the nation and demanding action. Setting a low bar for state-determined proficiency may weaken the push for needed changes.
These results also highlight the stakes involved in the Common Core State Standards Initiative being led by the National Governors Association and the Council of Chief State School Officers. No matter how hard the standards-setting part of this project has been, it will be easier than what remains to be done: turning these common standards into actual assessments, and then defining proficiency cut scores. The NGA and the CCSSO are still more than a little vague about how this will happen. And the U.S. Department of Education, which is putting $350 million into developing common assessments, has also left the path forward undefined.
Clearly, the whole common-standards effort is fraught with challenges. And one of the most fundamental, as my analysis shows, will be determining who sets the cut scores that define proficiency. As tension rises between setting a common cut score and deferring to the states, political pressure to let states set their own will be intense. But the disparities between state and TUDA results show just how high the stakes are in this decision.
One possible solution would be what is termed “partial pre-emption,” in which the federal government sets a uniform standard that states are free to exceed. This has been widely used in federal environmental laws and in setting safety standards through the Occupational Safety and Health Act, and is built into the Individuals with Disabilities Education Act. Not surprisingly, states are often unhappy with these pre-emptions, and push-back is common. Nonetheless, under the No Child Left Behind Act, we have already tried a system in which states were free to do as they pleased in setting proficiency standards. It didn’t work. Setting a national cut score, while allowing more aggressive states such as Massachusetts the freedom to exceed it, would be a big step forward.
This would require legislative action. The long-overdue reauthorization of the Elementary and Secondary Education Act, of which NCLB is the latest version, is the most appropriate place to revisit the current chaotic definition of proficiency. We can mend its flaws without weakening the national commitment to improving the education of all our students.
A version of this article appeared in the March 3, 2010, edition of Education Week as "The Proficiency Delusion."