Error Affects Test Results In Six States
A national testing company's error that mistakenly landed more than 8,000 New York City students in summer school has affected test results in at least five other states.
The chart below shows the number of students taking the test in the states and district where errors have been discovered.
| Jurisdiction | No. of Students |
| New York City | 100,000 |
CTB/McGraw-Hill this month notified educators in Indiana, Nevada, South Carolina, and Wisconsin that the percentile rankings of some students who took the company's popular TerraNova tests may be incorrect.
The glitch, which is having unusually widespread ramifications, also influenced results in Tennessee. But testing officials there caught the error last spring before scores were sent to students' homes and schools.
In those six states combined, more than a million students each year take the tests, which are an updated version of the Monterey, Calif., company's well-known Comprehensive Test of Basic Skills.
The consequences of the error were particularly serious in New York City, where school officials used the tests this year for the first time to decide which students to send to summer school and which to hold back a grade. Of the 8,700 students erroneously sent to summer school, nearly 3,500 were also held back unfairly, embarrassed administrators revealed two weeks ago. ("Summer School: Amid Successes, Concerns Persist," Sept. 22, 1999.)
In Nevada, state educators used the tests to identify which schools were "demonstrating need for improvement"--a categorization that carries with it both bad publicity and added technical and financial resources. Administrators were scrambling last week to determine whether any of the five schools across the state tagged with that label this year might not have deserved it.
McGraw-Hill, the New York City-based publishing giant that owns CTB, has apologized for the errors and is working to adjust the scores.
But testing experts said such mistakes should raise a cautionary flag for states and districts about the wisdom of using standardized tests--particularly a single test--as a tool for making educational decisions with serious consequences for individuals and schools.
"These kinds of mistakes happen," said George F. Madaus, a Boston College education professor. "It shows the fallibility of the technology and the need to use more than one test."
Despite the mistakes, state policymakers seem unlikely to back away from such tests, which have become increasingly important tools in the nationwide drive for greater educational accountability.
Pushing for Results
Political pressure to get tough with lagging schools and students has led most states in recent years to rely on test scores to reward good results and punish failure. In that kind of pressure-cooker environment, which seems likely to continue, mistakes may be even more common in the future, said Eva L. Baker, a co-director of the Center for Research on Evaluation, Standards, and Student Testing at the University of California, Los Angeles.
"It seems to me we're noticing more difficulty with these tests in the administration and management and scoring of them. I assume that's because the time schedule to turn these things around is much more compressed than it used to be," Ms. Baker said. "Now that some of these tests have consequences for kids and systems, we're finding out about relatively small errors that never used to matter much."
This past summer, for example, a computer error forced CTB's biggest rival in the test-publishing business, the San Antonio-based Harcourt Brace Educational Measurement, to rescore California's statewide achievement tests.
And earlier this month, Washington state educators asked Riverside Publishing Co. of Itasca, Ill., to reanalyze the writing exams taken by 204,000 students because the results on one section looked suspiciously low.
In addition, three of the states affected by the latest mistake--Indiana, Tennessee, and Wisconsin--have experienced other scoring problems with CTB in recent years.
The newest errors, which are limited to the version of the TerraNova known as form B, do not alter students' raw scores in any state. Instead, they stem from data-processing errors that occurred when company researchers were translating the raw scores to percentile rankings, which show how a student measures up against everyone else who took the test.
To do that, researchers compare the scores with those of a smaller national sample of 150,000 test-takers. The computer errors were found in 3 percent of the data from that comparison group.
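For readers who want to see why a flaw in the comparison group can move a student's ranking even when the raw score is untouched, here is a minimal sketch in Python. It is not CTB's actual procedure: the function name, the tie-handling rule, and every score value are assumptions chosen only for illustration.

```python
from bisect import bisect_left, bisect_right


def percentile_rank(raw_score, norm_scores_sorted):
    """Percent of the norm sample scoring below raw_score, counting ties as half.

    Hypothetical helper; the tie rule is a common convention, not CTB's method.
    """
    n = len(norm_scores_sorted)
    below = bisect_left(norm_scores_sorted, raw_score)
    ties = bisect_right(norm_scores_sorted, raw_score) - below
    return 100.0 * (below + 0.5 * ties) / n


# Made-up norm sample of ten raw scores (the real sample had about 150,000).
norm_sample = sorted([12, 15, 18, 20, 20, 23, 25, 27, 30, 34])

# The same sample with one low-end record corrupted (12 recorded as 21).
faulty_sample = sorted([21, 15, 18, 20, 20, 23, 25, 27, 30, 34])

student_raw_score = 20  # the raw score itself never changes

correct_rank = percentile_rank(student_raw_score, norm_sample)
faulty_rank = percentile_rank(student_raw_score, faulty_sample)
print(correct_rank, faulty_rank)
```

In this toy case, one bad record in the comparison group is enough to drop the same raw score to a lower percentile, which is the general kind of effect described for low-scoring students.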
"It was a minute amount of data in a larger study, and that's why it took a long time to identify," said William Jordan, a McGraw-Hill spokesman.
Because of the mistake, the rankings of many students who scored at the low end of the tests were underestimated. To a slightly lesser degree, high-scoring students' scores may also have been inflated. Test scores for students scoring in the middle range are not expected to change.
Mr. Jordan said questions about the credibility of the percentile rankings were first brought to the company's attention last spring by testing directors from Tennessee and New York City. But, while Tennessee opted to delay releasing its students' scores by two months, New York City officials reported their results on schedule. The scores were then used to make students' summer school assignments.
"We had been assured twice by CTB/McGraw-Hill that there was nothing wrong with the data," said Karen Crowe, a spokeswoman for the New York schools.
For Nevada officials, meanwhile, news of the errors came as a surprise. In that state, the tests determine which schools will be put on probation--a status that also qualifies them for a share of $3 million in state aid. To get on the list, 40 percent or more of a school's students must perform in the bottom quartile of the tests in four subjects for two consecutive years.
Glenn Duncan Elementary School, a school in urban Reno with high concentrations of poor and minority students, had reading scores just a few percentile points away from the cutoff when it was placed on probationary status. Now, school administrators are anxiously waiting to see whether their status will change.
"To be on the front page is awful because we're all here working as hard as we can," said June Hall, the school's principal. "The other side of it is that we've had a lot of support that we didn't have before."
State education officials expect to get corrected scores by Oct. 20. Until then, they will hold off deploying intervention teams to work in failing schools. But Duncan Elementary and other schools that have already received funding will be allowed to keep their extra state aid.
Beyond the added expense of mailing out new scores, the mistakes are expected to have little, if any, impact in the other affected states. South Carolina, for example, gives the TerraNova only to a representative sample of students, both to determine how the state measures up against other states and to serve as a check on its own achievement tests.
"When you begin to hear all the problems this is causing, and you begin to hear the words 'promotion' and 'retention,' you have to remind people that these tests contain substantial standard errors," said Benjamin Brown, Tennessee's testing director. "I think teachers having a student all year would certainly have something more substantial to say than a state achievement test."
It's unlikely, however, that the errors will diminish the popularity of CTB's testing products. Of the "Big Three" test publishers--CTB, Harcourt Brace, and Riverside--CTB is thought to control the biggest share of the market.
A report by Simba Information Inc., a business-research firm in Greenwich, Conn., estimates that the company's sales in 1997 were $80 million.
States and school systems are particularly attracted to the TerraNova, which was introduced in 1996, because it has the potential to report scores by achievement levels. If they dropped those tests now, they would lose valuable longitudinal data showing whether students are making progress over time.
Given the 72-year-old company's generally good record, most of the state testing directors said their states would likely stay the course with CTB--at least for the next year or two.
But, added Gary Cook, the testing director in Wisconsin, "I will certainly be looking over their shoulders to make sure it doesn't happen again."
Vol. 19, Issue 5, Pages 1,13-15