Study Argues Test Policies Don’t Work

By Lynn Olson, April 24, 2002

High-stakes testing is a “failed policy initiative” that does not produce gains on other measures of student learning, researchers at Arizona State University in Tempe argue in a recent paper.

“High-Stakes Testing, Uncertainty, and Student Learning,” by Audrey L. Amrein and David C. Berliner, appears in last month’s edition of the online scholarly journal Education Policy Analysis Archives.

It examines data from 18 states that attach high stakes to their test results. Such states, for example, use test scores to determine promotion from one grade to the next, graduation from high school, rewards for high-performing schools, and consequences for low-performing ones.

To see whether states that adopted high-stakes practices showed gains on other measures of student learning, the researchers conducted a “time-series analysis” in which they looked at scores obtained over two decades from four separate standardized tests. In particular, they examined changes in three college-admissions or -placement tests (the SAT, the ACT, and the Advanced Placement exams) and in the National Assessment of Educational Progress.

The researchers examined changes in SAT scores from 1977 to 2001, in ACT scores from 1980 to 2001, in AP scores from 1995 to 2000, and in NAEP reading and math scores from 1990 to 2000. For each state, they looked at whether those scores rose or fell in the years after the state required the first high school class to pass an exam to graduate, by analyzing short-term, long-term, and overall achievement trends.

“Analyses of these data reveal that if the intended goal of high-stakes-testing policy is to increase student learning, then that policy is not working,” the authors conclude. “While a state’s high-stakes test may show increased scores, there is little support in these data that such increases are anything but the result of test preparation and/or the exclusion of students from the testing process.”

In particular, the authors found:

  • Twelve of the 18 states posted overall decreases in ACT performance after high school exit exams were implemented; those decreases were not related to changes in the proportion of students taking the exam. Ten of the states with graduation exams posted overall decreases in SAT performance after the tests were in place. Those decreases were slightly related to changes in SAT participation rates.
  • Participation rates in ACT testing—an indicator of whether more students were motivated to attend college—increased in nine of the states, decreased in six, and stayed the same in three after the imposition of high-stakes exit tests. Participation rates on the SAT, compared with the national average, fell in 11 of the states with graduation exams.
  • States with high school graduation exams also had a decrease in the percent of students who passed AP tests, after controlling for student-participation rates.
  • Gains and losses on NAEP mathematics tests in grades 4 and 8 were more strongly related to changes in the percent of students excluded from NAEP in each state than to whether states used high-stakes testing. If anything, the authors found, the weight of the evidence suggests that students from states with high-stakes tests did not achieve as well on the grade 8 math NAEP during the 1990s as students in other states did.
  • The one place where some states made “real” gains—that were not affected by changes in participation rates—was for the cohort of students moving from the 4th to the 8th grade and taking the 1994 and 1998 NAEP reading exams. Gains in scores were posted 2.3 times more often than losses in the states with high-stakes-testing policies.

The authors argue, however, that during the same period, many states and districts also launched reading-curriculum initiatives. “Because of that, it is not easy to attribute the gains made for the NAEP reading cohort to high-stakes-testing policies,” they write. “Our guess is that the reading initiatives and the high-stakes testing are entangled in ways that make it impossible to learn about their independent effects.”

Training, Not Learning

“I think the article is important because it points out the confusion between training and learning,” Mr. Berliner, a professor of education at Arizona State University, said by e-mail last week. “Let me give you an example: You can teach almost any kid to play ‘Chopsticks’ on the piano. But by doing that, have you taught the child to play the piano? Does that qualify those kids as musicians? I don’t think so.”

“High-stakes tests are like playing ‘Chopsticks,’ ” he added. “Sure, you can get a great performance out of the kids. But so what? They cannot play the piano! Through test preparation, drill, narrowing the curriculum, and excluding the kids that are English-language-learners and who are in special education, you can increase scores on the state test. Any state test. But our data demonstrate that the students’ scores in the domains that the state’s tests are representing (reading, language arts, mathematical reasoning) did not change as they were supposed to. Our conclusion is that all we have so far is ‘Chopsticks’! Training, but no learning.”

In a separate, forthcoming paper, the authors are reviewing research on the potentially negative consequences of high-stakes testing, including teaching to the test, narrowing of the curriculum, and cheating.

Richard L. Allington, a professor of education at the University of Florida, praised the Education Policy Analysis Archives article for providing a “massive set of data” that shows current testing policies are not working.

But in an Internet posting, Chester E. Finn Jr., the president of the Washington-based Thomas B. Fordham Foundation, criticized the article as being “more hatchet job than careful social science.”

Mr. Finn asserted that college-admissions tests such as the SAT are not taken by all students and are less apt to be influenced by state accountability policies aimed at low-performing students and schools.

“The weakness of our study is the same as all studies that attempt to show transfer; namely, it is hard to do,” responded Mr. Berliner. But he added: “If it is so that SAT and ACT and other tests we use are not good measures of transfer, then why the hell do we spend tens of millions of dollars and hundreds of millions of person hours on them every year? Let us be clear: Either these tests measure a sample of the important things that schools are supposed to teach, or we are a nation of idiots for giving these tests year in and year out.”

Mr. Berliner advocated having teachers, with the help of academic-content experts, design state tests to meet state standards, score the tests, and then meet to discuss the tests and standards.

He also said that while NAEP can be used to measure the transfer of student learning, students should not be excluded from the national assessment, a federal program that tests representative samples of students. Pertinent information on student characteristics should be provided, however, to help interpret NAEP results, according to the researcher.

The 18 states examined in the study were Alabama, Florida, Georgia, Indiana, Louisiana, Maryland, Minnesota, Mississippi, Nevada, New Jersey, New Mexico, New York, North Carolina, Ohio, South Carolina, Tennessee, Texas, and Virginia.

A version of this article appeared in the April 24, 2002 edition of Education Week as Study Argues Test Policies Don’t Work
