Assessment Opinion

Does High-Stakes Testing Hurt Students?

By Laurence Steinberg — February 05, 2003
Does high-stakes testing hurt students? Read the early evidence with caution.

According to a recent front-page story in The New York Times, high-stakes testing does more harm than good, increasing the proportion of students who drop out of school, decreasing the proportion who graduate, and diminishing students’ performance on standardized tests of achievement. But before President Bush and his education team abandon their efforts to hold students, teachers, and schools accountable, they should read the actual report on which the news story was based.

The study contained in the report, conducted by Arizona State University researchers Audrey Amrein and David Berliner and paid for by several affiliates of the National Education Association, analyzed trends over time in student achievement and school completion in states that have implemented high-stakes testing and compared these trends with national averages for the same indicators. If student performance following the introduction of the test appeared to decline relative to the national trend over the same time period, the authors concluded that the testing had a negative effect. (“Reports Find Fault With High-Stakes Testing,” Jan. 8, 2003.)

Based on their analyses, the authors compiled a scorecard that tallied the number of states in which testing had a negative effect, the number in which the effect was positive, and the number in which the impact of testing was mixed or unclear. Because the number of states in the negative column exceeded the number in the positive column, they concluded that, on average, high-stakes testing is bad.

The so-called declines the authors used to categorize states into the winning or losing columns are often so small as to be meaningless, however. Consider, for example, the “strong” evidence that the implementation of high-stakes testing in New York had adversely affected school completion. From the sound of it, one would think that thousands of students had dropped out as a result of the testing. In fact, during the period after the 1994 introduction of graduation exams in New York, the state’s dropout rate didn’t increase at all—it remained flat, whereas during the same period, the national dropout rate declined by 1 percent. And what was the staggering drop in New York’s graduation rate following the introduction of the test? The rate declined by three-tenths of 1 percent during a time when graduation rates remained unchanged nationally. Nevertheless, on the basis of this “strong” evidence, New York ends up in the column of states whose students were ostensibly harmed by testing.

By the time one gets to the authors’ summary table, though, to say nothing of the hyperbolic press release that trumpeted the report, the actual sizes of the effects under discussion are long forgotten. In other words, the list of states where students were allegedly harmed by testing could include states whose indicators barely changed as well as those where they changed a great deal. In fact, there were many of the former and few of the latter. Indeed, of all the states whose graduation rates declined following the implementation of testing, none saw a decline that differed from the national average by more than 1.6 percent. Moreover, the average relative decline in graduation rates among states whose rates fell was smaller than the average relative increase in graduation rates among states whose rates rose. The data showing changes in achievement-test scores are equally meaningless, with the putative effects of testing usually smaller than the margins of error in the tests.

When a trend being analyzed is brief, it is easy to be fooled into thinking it is meaningful.

Social scientists generally are interested not only in the size of an effect, but in whether the result is statistically significant. Yet nowhere do the authors of this report say whether the effects they claim to have uncovered are statistically significant, most likely because they are not. (I corresponded with Ms. Amrein and learned that no significance-testing had been done.) This is important, because findings that look impressive are frequently chance occurrences. When a trend being analyzed is brief, it is easy to be fooled into thinking it is meaningful. Suppose, for example, that a coin I flipped four times in a row landed on heads each time. Would you be willing to believe that I had discovered a magic coin that always turned up heads, or would you want to see a few more flips? In the analyses presented in this report, not only are the effects often minuscule, but few of the trends the authors describe are long enough to support any reliable conclusions about the impact of testing on anything.
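The coin-flip arithmetic can be checked directly. The short Python sketch below is an illustration of the statistical point only, not an analysis from the Amrein-Berliner report: it computes the chance that a fair coin lands heads four times in a row and confirms the figure by simulation.

```python
import random

# Chance of four heads in a row with a fair coin: (1/2)**4 = 1/16 = 6.25%.
# That exceeds the conventional 5% significance threshold, so even a
# "perfect" four-flip streak is not statistically significant evidence
# of a magic coin.
p_four_heads = 0.5 ** 4
print(f"P(4 heads in a row, fair coin) = {p_four_heads:.4f}")  # 0.0625

# A quick simulation makes the same point: short streaks arise by chance
# alone at roughly that rate. (Hypothetical illustration, not report data.)
random.seed(1)
trials = 100_000
streaks = sum(
    all(random.random() < 0.5 for _ in range(4))  # one run of four flips
    for _ in range(trials)
)
print(f"Simulated frequency of 4-heads runs: {streaks / trials:.4f}")
```

With only four observations, chance alone produces the "impressive" result about one time in sixteen, which is exactly why short trend lines, like those in the report, cannot carry strong conclusions.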

It is conceivable, of course, that implementing high-stakes testing could influence dropout or graduation rates, although the authors of this report, as well as those who funded it, will have a hard time explaining why, in several states, the trend lines point to declining dropout rates and rising graduation rates after the introduction of testing. (I don’t place much credence in these results, either, because they, too, are unlikely to be statistically significant.) But the authors’ contention that the implementation of high-stakes testing depressed students’ performance on tests like the SAT or ACT is just plain silly. Performance on these tests is strongly linked to students’ socioeconomic status and is marginally, if at all, affected by what takes place in the classroom.

And then, of course, there is what social scientists call the third-variable problem. During the period following the implementation of testing, plenty of other factors change as well, and many of these factors could conceivably influence dropout and graduation rates as well as achievement-test scores. Comparing each state’s trend to the national trend does not solve this problem, because factors that may have changed in a particular state may not have changed in the same way across the nation.

It is conceivable, of course, that implementing high-stakes testing could influence dropout or graduation rates.

One potentially important factor, for example, is the size of the state’s Hispanic population, because Hispanic youngsters drop out of school at a much higher rate than do other students. The two states where the relative increase in the dropout rate following the introduction of testing appears to be large enough to be worrisome—Nevada and New Mexico—are states with high and rapidly growing Latino populations. In fact, five of the eight states that showed a relative increase in their dropout rates following the introduction of testing are states with large Latino populations that grew dramatically during the time frame examined in the report (the other three are New York, Texas, and Florida). In all likelihood, this change in demographics, and not the implementation of testing, led to higher rates of dropping out and lower test scores.

A sensible reading of the evidence to date suggests that high-stakes testing so far has had neither the dramatic beneficial effects hoped for by its proponents nor the catastrophic ones feared by its detractors. But even this conclusion is not cautious enough. It will take many years, perhaps even decades, to assess the impact of such a dramatic change in educational policy and practice on student achievement.

Does high-stakes testing encourage teaching to the test? Probably. But this is not a problem if the tests that teachers are teaching to are measuring things we want our students to learn. As long as this is the case, there is nothing wrong with ensuring that students have mastered what we expect them to know before promoting them to the next grade level. How can anyone oppose that?

Laurence Steinberg is the Distinguished University Professor of Psychology at Temple University in Philadelphia.
