That Sinking Feeling
Bad tests and overreliance on test results are enemies of good standards. Just look at what happened in Atlantis.
To grasp the impact that high-stakes tests are having on school reform in this country, imagine a 51st state. Let's say it's an island, the state of Atlantis. Because this imaginary island-state is slowly sinking into the sea, its schools teach only one subject: swimming. When the movement for higher standards reached Atlantis, it set off a frenzy of reform debate, a perfect mirror of what is happening in our 50 states.
First Atlantis set about creating content standards—what every student needs to know about swimming. Some adults on Atlantis wanted a simple "survival swimming" curriculum, which would teach children to tread water. Others wanted a rich curriculum in which all students would be taught the four competition strokes: butterfly, crawl, backstroke, and breaststroke. Rather than make hard decisions, it proved easier to include just about everything, and so Atlantis schools now teach everything from the dogpaddle on up.
Establishing performance standards came next. Performance standards set the levels of skill and knowledge a student must demonstrate to pass or to earn distinction. After prolonged debate, Atlantis adopted four performance levels: advanced, proficient, basic, and below basic. "Advanced" swimmers must be able to swim four laps—one of each competition stroke—in less than nine minutes, while "proficient" swimmers must complete the laps in under 15 minutes.
Atlantis decided that "basic" swimmers would earn diplomas simply by finishing, no matter how long it took. This last decision angered educational conservatives, who accused Atlantis education officials of watering down the curriculum.
Students who fail to complete the four laps are in the "below basic" category. They either drown or drop out of school.
Atlantis simply ignored opportunity-to-learn standards. The idea behind this third type of standard is that it is unfair to hold students to performance standards unless conditions exist to give them a fair chance of meeting them. So, for example, for all Atlantis students to have a realistic chance of mastering the four strokes, they would be taught in lap pools by trained swimming coaches. However, trained swimming coaches are hard to find on Atlantis. Most schools have access to water (it's an island, after all), but swimming instruction is generally left to adults who swim recreationally.
Measuring learning was the next hurdle. Teachers wanted classroom grades and their professional judgment to count, but Atlantis decided that students had to prove they could swim. Then, however, they couldn't agree on how to test them. Requiring each student to swim a lap of at least one of the four strokes would be what educators call a valid test, just as testing writing skills by requiring students to write an essay is valid. Student flaws could be identified precisely and corrected immediately, and students could jump back in the pool and try again. Unfortunately, direct demonstration of skill is costly and time-consuming, and so Atlantis opted for a machine-scored, multiple-choice test of swimming. It ruled that students had to pass this test in order to graduate.
This decision had unintended effects on the curriculum at many Atlantis schools. Reporters found that students were swimming less, drilling more. And many schools eliminated enrichment programs in water polo, diving, and synchronized swimming in order to focus on the basics.
Responding to protests that a multiple-choice exam was not a valid test of swimming, Atlantis added an essay question: "In 250 words, describe what it feels like to swim a lap of the butterfly, with particular attention to the turn."
Parents in some wealthy Atlantis communities threatened to boycott the high-stakes test. As one Atlantis parent put it, "Our children learned to swim at home. We want our children playing water polo and learning to dive, not wasting time on drill."
Low-performing Atlantis students, many of them disadvantaged, seem to have benefited from multiple-choice exams, according to an Atlantis report. "Since the adoption of the multiple-choice format, drowning as a reason for nonattendance has been virtually eliminated," the report asserts. School records confirm this; last year only a handful of students stopped coming to school because they had drowned, all of them students in the "below basic" classes. The dropout rate, however, remains high.
Every seemingly precise standardized-test result has an error range, called the standard error of measurement. Thus, for example, on a test with an error range of 4 points, a score of 72 means the student's so-called true score lies somewhere between 68 and 76. And so, reliance on machine-scored, multiple-choice tests always raises questions about the reliability of the results. Detailed analysis of last year's test results on Atlantis revealed that dozens of students passed at the "basic" swimmer level and earned diplomas even though they could barely keep their heads above water. Some certified lifeguards were denied diplomas because they scored at the "below basic" level. The testing company made errors in scoring exams, leading to more kids' being denied promotion or graduation.
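For readers who want to see the error-range arithmetic spelled out, here is a minimal sketch in Python. It simply adds and subtracts the error range from an observed score, as in the example of 72 with a 4-point range. The function names and the passing cutoff of 70 are hypothetical, chosen only for illustration; no real test's cutoff is implied.

```python
def score_band(observed, error_range):
    """Return the (low, high) band implied by an observed score and its error range."""
    return observed - error_range, observed + error_range


def decision_is_uncertain(observed, error_range, cutoff):
    """True when the passing cutoff falls inside the band, so pass/fail could go either way."""
    low, high = score_band(observed, error_range)
    return low <= cutoff <= high


# The example from the text: a score of 72 on a test with a 4-point error range.
low, high = score_band(72, 4)
print(low, high)  # 68 76

# Hypothetical cutoff of 70 (an assumption, not from any real test).
print(decision_is_uncertain(72, 4, 70))  # True
```

With a cutoff that sits inside the band, the same student could plausibly land on either side of the line, which is one way false positives and false negatives arise.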
Atlantis may be imaginary, but everything I've imagined happening there is real here. Content standards are often bloated, and performance standards have been lowered in Arizona, California, New York, New Jersey, New Mexico, and other states in the face of high failure rates. States have ignored opportunity-to-learn standards, because it's too painful to confront the fact that, for example, more than one-third of our high school students are taught science and math courses by adults who neither studied nor trained to teach those subjects. No imaginary Atlantis educators have been caught cheating, but real ones have been fired or indicted in Maryland, Texas, Connecticut, and other states.
Most of our states rely on machine-scored, standardized tests, often resulting in cuts in "frills" like art, music, and physical education, and more drill. Twenty-four states have adopted a high-stakes graduation test, with more likely to follow suit. And here at home, test boycotts are a reality: Parents in well-to-do communities like Scarsdale, N.Y., and Berkeley, Calif., kept their children home on test days in the spring of 2001.
Because standardized tests are not precise instruments, so-called "false negatives" and "false positives" are also a fact of life. By Massachusetts's own calculation, it's statistically likely that about 3,000 of the 10th graders who failed the state math test in 1998 actually passed. Errors by testing companies penalized students in Arizona, Minnesota, Indiana, Massachusetts, and New York.
On Atlantis, and here in the United States, bad tests and the overreliance on test results are enemies of good standards. Here and on Atlantis, it's students who suffer the most, not the politicians and other adults who make the policies. Atlantis is, of course, sinking slowly into the sea, meaning that its bitter arguments over educational standards and high-stakes tests will soon be over.
The rest of us should be so lucky.
John Merrow, the author of Choosing Excellence: "Good Enough" Schools Are Not Good Enough (Scarecrow Press, 2001), recently received a Peabody Award for his PBS documentary "School Sleuth: The Case of an Excellent School." His next program, "Testing the Schools," will appear on PBS's "Frontline" Oct. 25.
Vol. 21, Issue 7, Pages 32, 35
Published in Print: October 17, 2001, as "That Sinking Feeling"