Assessment Commentary

Does High-Stakes Testing Hurt Students?

By Laurence Steinberg, February 05, 2003
Does high-stakes testing hurt students? Read the early evidence with caution.

According to a recent front-page story in The New York Times, high-stakes testing does more harm than good, increasing the proportion of students who drop out of school, decreasing the proportion who graduate, and diminishing students’ performance on standardized tests of achievement. But before President Bush and his education team abandon their efforts to hold students, teachers, and schools accountable, they should read the actual report on which the news story was based.

The study contained in the report, conducted by Arizona State University researchers Audrey Amrein and David Berliner and paid for by several affiliates of the National Education Association, analyzed trends over time in student achievement and school completion in states that have implemented high-stakes testing and compared these trends to national averages for the same indicators. If student performance following the introduction of the test appeared to decline relative to the national trend over the same time period, the authors concluded that the testing had a negative effect. (“Reports Find Fault With High-Stakes Testing,” Jan. 8, 2003.)

Based on their analyses, the authors compiled a score card that tallied the number of states in which testing had a negative effect, the number of states in which the effect was positive, and the number in which the impact of testing was mixed or unclear. Because the number of states in the negative column exceeded the number in the positive column, they concluded that, on average, high-stakes testing is bad.

The so-called declines the authors used to categorize states into the winning or losing columns are often so small as to be meaningless, however. Consider, for example, the “strong” evidence that the implementation of high-stakes testing in New York had adversely affected school completion. From the sound of it, one would think that thousands of students had dropped out as a result of the testing. In fact, during the period after the 1994 introduction of graduation exams in New York, the state’s dropout rate didn’t increase at all—it remained flat, whereas during the same period, the national dropout rate declined by 1 percent. And what was the staggering drop in New York’s graduation rate following the introduction of the test? The rate declined by three-tenths of 1 percent during a time when graduation rates remained unchanged nationally. Nevertheless, on the basis of this “strong” evidence, New York ends up in the column of states whose students were ostensibly harmed by testing.
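
The New York figures can be checked with simple arithmetic. The sketch below assumes that "relative to the national trend" means the state's change minus the national change, and that the quoted figures are percentage-point changes; the report itself may define the comparison differently. The function name `relative_change` is hypothetical and chosen only for illustration.

```python
def relative_change(state_change: float, national_change: float) -> float:
    """State change minus national change (assumed definition of 'relative')."""
    return state_change - national_change


# Figures quoted in the paragraph above, treated as percentage points.
ny_dropout_gap = relative_change(state_change=0.0, national_change=-1.0)
ny_graduation_gap = relative_change(state_change=-0.3, national_change=0.0)

print(ny_dropout_gap)      # 1.0: flat in New York while the nation fell by 1
print(ny_graduation_gap)   # -0.3: down three-tenths while the nation was flat
```

Under that assumption, a gap of one point or less in either direction is enough to place a state in the "harmed" column.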

By the time one gets to the authors’ summary table, though, let alone the hyperbolic press release that trumpeted the report, the actual sizes of the effects under discussion are long forgotten. In other words, the list of states where students were allegedly harmed by testing could include states whose indicators barely changed as well as those where they changed a great deal. In fact, there were many of the former and few of the latter. Indeed, of all the states whose graduation rates declined following the implementation of testing, none saw a decline that differed from the national average by more than 1.6 percent. Moreover, the average relative decline in graduation rates among states whose rates fell was smaller than the average relative increase in graduation rates among states whose rates rose. The data showing changes in achievement-test scores are equally meaningless, with the putative effects of testing usually smaller than the margins of error in the tests.

Social scientists generally are interested not only in the size of an effect but also in whether the result is statistically significant. Nowhere do the authors of this report say whether the effects they claim to have uncovered are statistically significant, most likely because they are not. (I corresponded with Ms. Amrein and learned that no significance testing had been done.) This is important, because findings that look impressive are frequently chance occurrences. When a trend being analyzed is brief, it is easy to be fooled into thinking it is meaningful. Suppose, for example, a coin I flipped four times in a row landed on heads each time. Would you be willing to believe that I had discovered a magic coin that always turned up heads, or would you want to see a few more flips? In the analyses presented in this report, not only are the effects often minuscule, but few of the trends the authors describe are long enough to support any reliable conclusions about the impact of testing on anything.
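
To put a number on the coin example, here is a minimal sketch that computes how often a perfectly ordinary coin would land heads several times in a row. It assumes a fair coin and independent flips; the function name `prob_all_heads` is hypothetical and is not drawn from the report.

```python
from fractions import Fraction


def prob_all_heads(flips: int, p_heads: Fraction = Fraction(1, 2)) -> Fraction:
    """Probability that `flips` independent tosses all land heads."""
    return p_heads ** flips


four_heads = prob_all_heads(4)
print(four_heads)          # 1/16
print(float(four_heads))   # 0.0625
```

A fair coin produces four straight heads about 6 percent of the time, more often than the 5 percent threshold social scientists conventionally use for statistical significance. A run that short cannot distinguish a magic coin from an ordinary one.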

It is conceivable, of course, that implementing high-stakes testing could influence dropout or graduation rates, although the authors of this report, as well as those who funded it, will have a hard time explaining why, in several states, the trend lines point to declining dropout rates and rising graduation rates after the introduction of testing. (I don’t place much credence in these results, either, because they, too, are unlikely to be statistically significant.) But the authors’ contention that the implementation of high-stakes testing depressed students’ performance on tests like the SAT or ACT is just plain silly. Performance on these tests is strongly linked to students’ socioeconomic status and is marginally, if at all, affected by what takes place in the classroom.

And then, of course, there is what social scientists call the third-variable problem. During the period following the implementation of testing, plenty of other factors change as well, and many of these factors could conceivably influence dropout and graduation rates as well as achievement-test scores. Comparing each state’s trend to the national trend does not solve this problem, because factors that may have changed in a particular state may not have changed in the same way across the nation.

One potentially important factor, for example, is the size of a state’s Hispanic population, because Hispanic youngsters drop out of school at a much higher rate than do other students. The two states where the relative increase in the dropout rate following the introduction of testing appears to be large enough to be worrisome, Nevada and New Mexico, are states with high and rapidly growing Latino populations. In fact, five of the eight states that showed a relative increase in their dropout rates following the introduction of testing are states with large Latino populations that grew dramatically during the time frame examined in the report (the other three, besides Nevada and New Mexico, are New York, Texas, and Florida). In all likelihood, this change in demographics, and not the implementation of testing, led to higher rates of dropping out and lower test scores.

A sensible reading of the evidence to date suggests that high-stakes testing so far has had neither the dramatic beneficial effects hoped for by its proponents nor the catastrophic ones feared by its detractors. But even this conclusion is not cautious enough. It will take many years, perhaps even decades, to assess the impact of such a dramatic change in educational policy and practice on student achievement.

Does high-stakes testing encourage teaching to the test? Probably. But this is not a problem if the tests that teachers are teaching to are measuring things we want our students to learn. As long as this is the case, there is nothing wrong with ensuring that students have mastered what we expect them to know before promoting them to the next grade level. How can anyone oppose that?

Laurence Steinberg is the distinguished university professor of psychology at Temple University in Philadelphia.
