It’s time for April showers and the perennial tempest of success and despair that is the release of the Nation’s Report Card in mathematics and reading. Though the scores for America’s 4th and 8th graders won’t be out until Tuesday, a variety of education watchers are already trying to decide what to think of the results.
The National Assessment of Educational Progress, administered every two years to a representative sample of students in every state, is considered the gold standard in U.S. assessments and a finger on the pulse of the nation’s schools. And at a time when some states are still deciding how they will test under the Every Student Succeeds Act, it may be the only way to consistently measure student progress across all states.
“Every two years we go through this,” said Andrew Ho, a Harvard University education professor and a member of the National Assessment Governing Board, which supervises the NAEP. “It’s a periodic, frustrating, and occasionally galvanizing reminder of how difficult it is to make large-scale improvements in education.”
Looking for Trends
The last results, released in 2015, gave educators and policymakers alike heartburn by showing statistically significant declines in both math and reading for the first time in 20 years. And officials may be primed for even more concern about the results Tuesday, following recent lackluster results for American students in international reading and math tests.
There are already a few signs of concern, particularly around the NAEP’s move in 2017 from a paper-and-pencil format to one given on tablets or digital devices. The results, originally scheduled for release in October, were delayed while researchers at the National Center for Education Statistics, which administers NAEP, conducted two “bridge” studies to link the old and new versions of the tests. That involved giving samples of students both print and digital versions of the NAEP in 2015 and 2017 and comparing the results.
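The mechanics of that kind of linking can be sketched in miniature. The snippet below uses simple mean-sigma linear linking, a textbook simplification rather than NCES’s actual methodology, and the score samples are invented for illustration:

```python
import statistics

def linear_link(digital_scores, print_scores):
    """Mean-sigma linear linking: map a score from the digital scale
    onto the print scale by matching the two samples' means and
    spreads. (A simplified stand-in for bridge-study linking.)"""
    mu_d, sd_d = statistics.mean(digital_scores), statistics.pstdev(digital_scores)
    mu_p, sd_p = statistics.mean(print_scores), statistics.pstdev(print_scores)
    return lambda x: mu_p + sd_p * (x - mu_d) / sd_d

# Invented samples in which the digital form ran a few points lower.
digital_sample = [270, 275, 280, 285, 290]
print_sample = [274, 279, 284, 289, 294]
link = linear_link(digital_sample, print_sample)
print(link(280.0))  # the digital midpoint maps onto the print midpoint, 284.0
```

A real bridge study estimates this relationship from randomly equivalent samples taking each mode, which is why NCES administered both versions in 2015 and 2017.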
Moving from one format to another can change what’s being measured, and some prior studies have found students scored lower on average on digital versions than on paper ones. Concerns about these so-called “mode effects” have already prompted questions from some states. Both the Council of Chief State School Officers and Louisiana schools chief John White have asked for more information about how differences among states could affect the tests.
“I’m somewhat less concerned [about mode effects] because NCES has a lot of control. At the state level you have more variation—this district used Chromebooks, that district used iPads ...,” said Daniel Koretz, the author of the book The Testing Charade: Pretending to Make Schools Better. “I’m not hitting the panic button, but I’m going to be looking for it.”
But there are some cautions that come up in every cycle. Here are three things we likely won’t be able to learn from the results on Tuesday:
1. How did my student do on the test?
Unlike state accountability assessments, NAEP does not test every student in a given state or district. In fact, the NAEP takes only about an hour to 90 minutes to complete because each student sees only a portion of the test. That means NAEP does not measure individual students’ growth, and it’s not possible to compare, say, how every 8th grader performed on a given question.
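This design, in which each student gets only a subset of the full item pool, is often called matrix sampling. A rough sketch follows; the block names and sizes are invented, and NAEP’s actual forms are assembled with balanced incomplete block spiraling rather than the simple random draw used here:

```python
import random

def assign_blocks(student_ids, item_blocks, blocks_per_student=2, seed=0):
    """Give each sampled student a small subset of item blocks, so the
    full pool is covered across students but no single student takes
    the whole test (a rough sketch of matrix sampling)."""
    rng = random.Random(seed)
    return {sid: rng.sample(item_blocks, blocks_per_student)
            for sid in student_ids}

# Hypothetical pool of five content blocks and 100 sampled students.
blocks = ["algebra", "geometry", "measurement", "data", "number_sense"]
students = [f"s{i}" for i in range(100)]
forms = assign_blocks(students, blocks)

covered = {block for form in forms.values() for block in form}
print(len(forms["s0"]), len(covered))  # each student sees 2 blocks; all 5 are covered
```

The trade-off is exactly the one described above: the design yields reliable group-level estimates, but no individual student’s form supports a meaningful individual score.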
Moreover, the NAEP is geared to be challenging, so it offers more “difficult” questions than easier ones at each grade level. That makes it different from international tests like the Program for International Student Assessment or the Progress in International Reading Literacy Study, which ask more questions at lower ability levels. That’s important, because it means those international tests can show more nuance in performance among the lowest-achieving students, while NAEP can show more details about high-performing students. For example, that difference in test difficulty highlighted widening achievement gaps in reading when PIRLS released its most recent results back in December.
2. Are NAEP results proof that the Common Core was effective—or that it failed?
NAEP takes a snapshot of student achievement. While it is possible to use NAEP to look at the relationships between a variety of factors and student achievement, it is very, very difficult to draw conclusions about what caused particular changes in test scores. Yet the results are routinely pointed to as evidence that federal, state, or local policies succeeded or failed—a practice Mathematica’s Steve Glazerman has dubbed “misNAEPery.”
NAEP’s results, both overall and for particular student groups, do provide an important benchmark against which states can compare the performance and achievement gaps of their own assessments. But it will likely be a few months before states can dig into those details, because NCES has not yet released its updated study mapping state proficiency standards to NAEP’s.
3. My state fell in comparison to the next state over in math. Should we overhaul our curriculum to look like that state’s program?
Focusing on state rankings, of either proficiency levels or achievement gaps, can lead states and districts to make the wrong comparisons. Experts have raised similar concerns about country comparisons on international tests.
“If you want to conclude that New England is ahead of the Southeast, you are probably on very safe ground. But when you get to comparing the state ranked 4th and the state ranked 6th, that’s not safe,” said Koretz, because two close states may not have scores that are statistically different, particularly for individual content areas within a subject or grade.
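Koretz’s caution can be illustrated with a simple two-sample z-test on state averages. The means and standard errors below are invented for illustration, but they show why a one-point gap between neighboring states is usually not statistically distinguishable while a several-point regional gap is:

```python
import math

def score_gap_is_significant(mean_a, se_a, mean_b, se_b, z_crit=1.96):
    """Rough two-sample z-test: are two state averages statistically
    distinguishable at roughly the 95% level, given their standard errors?"""
    se_diff = math.sqrt(se_a**2 + se_b**2)
    z = (mean_a - mean_b) / se_diff
    return abs(z) > z_crit

# Hypothetical states one point apart, each with a standard error near 1:
print(score_gap_is_significant(285.0, 1.1, 284.0, 1.0))  # False: not distinguishable
# A hypothetical nine-point regional gap with the same standard errors:
print(score_gap_is_significant(290.0, 1.1, 281.0, 1.0))  # True
```

This is why a change in rank between two closely bunched states says little on its own; the reported standard errors matter as much as the point estimates.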
“NAEP is a very, very good test, but it is just one test,” he added. “Different measures give us different answers.”
The most interesting information may come not from state achievement rankings, but from more detailed data on gaps between student groups and background information collected about school, teacher, and student characteristics and practices that can add context to the results. Recent contextual analyses have revealed significant differences among states and districts based on school and residential segregation, for example.
“That’s the promise I see in the explosion of secondary research around NAEP,” Ho said. “After the misNAEPery and the speculation dies down, then we can get to work and start disentangling the causes and correlates ... and start to reliably talk about the progress states and districts are making, and not just leader board status.”
Want more background on the Nation’s Report Card? Here’s a primer on how the NAEP fits into the larger assessment landscape. And here’s a detailed look at some of the common misuses of NAEP data.
A version of this news article first appeared in the Inside School Research blog.