San Francisco--A West Virginia physician was correct when he found that most students in all states scored above average on standardized achievement tests, a team of leading experts conducting a federally funded study reported here last week.
But the extent to which states and districts report above-average scores is not as great as the physician, John Jacob Cannell, suggested, the researchers said.
And they dismissed Dr. Cannell’s contention that the high scores reflect a conspiracy among test publishers and administrators to mislead the public. Rather, they said, the scores result from outdated norms and efforts to “teach to the test.”
Responding to the findings, Dr. Cannell--who is currently studying psychiatry in Albuquerque--said he felt “vindicated.”
“I feel my main points have been corroborated,” he said.
Daniel Koretz, senior social scientist at the RAND Corporation, said the new study “removes one of the irritating irrelevancies around the previous study, [namely] the ad hominem arguments against Cannell’s lack of background.”
“He was clearly right about his basic conclusion,” Mr. Koretz said. “With respect to national averages, districts and states are presenting inflated results and misleading the public.”
But Leigh Burstein, one of the researchers who conducted the new study, said the focus on whether or not Dr. Cannell was correct had deflected attention from states’ efforts to present a more accurate picture of student achievement.
“If the super-concern about what proportion of kids are above average dominates the discussion, we have lost the battle,” said Mr. Burstein, professor of education at the University of California at Los Angeles.
“Some states are doing an exemplary job of communicating how all the kids in their state are doing,” he said. “You can’t get that from John Cannell’s numbers.”
Dr. Cannell’s study, released in November 1987, was believed to be the first ever to examine states’ and districts’ results on nationally normed achievement tests. On such tests, students are compared against a norming sample who took the test as many as seven years earlier.
Based on a survey conducted for a few hundred dollars, the study found that “no state scores below the publisher’s ‘national norm’ at the elementary level on any of the six major nationally normed, commercially available tests.”
A survey of 3,503 districts, moreover, found that 2,857--or 82 percent--scored above average on the tests.
These findings quickly became known as the “Lake Wobegon effect,” after the writer Garrison Keillor’s mythical town in which “all the men are strong, all the women are good-looking, and all the children are above average.”
The report created a firestorm within the testing community, and focused national attention on an increasingly controversial issue.
“Our community should be extremely, although probably belatedly, grateful to John Cannell for the attention he has mobilized on the question of standardized testing,” said Eva L. Baker, director of the center for research on evaluation, standards, and student testing at the University of California at Los Angeles.
In response to the outcry, the U.S. Education Department held a meeting last year to investigate the Cannell study’s findings. The participants, who included leading test publishers and experts in the assessment field, agreed that the physician’s conclusions were generally accurate.
But to obtain a clearer picture, the department agreed to fund a project to replicate his study, and provided $80,000 to the UCLA center for the work.
In the new study, Robert L. Linn, professor of education at the University of Colorado, surveyed the 35 states that compare test scores against national norms. Unlike Dr. Cannell, who collected the most recent year’s overall scores for elementary students, Mr. Linn obtained data on reading and mathematics scores for each state in each grade level for three years.
According to Mr. Linn, the survey found that in each state “the overall percent of students above the national median is greater than 50 in all the elementary grades in both reading and mathematics for each of the three years studied.”
But the subject-area and grade-level results suggested greater variability in scores, the researcher added. The scores tended to be higher for math than for reading, and elementary students tended to post higher scores than secondary students.
In addition to the state survey, Mr. Linn surveyed a random sample of 175 districts. Although analysis of the sample data is still under way, he said, preliminary results suggest that “it is clear it is more common for districts to report above-average than below-average scores, although the findings are not as extreme as Cannell suggested.”
In an attempt to determine the cause of these high scores, Mr. Linn analyzed other indicators of student achievement to gauge whether the scores reflected improvement in achievement over time.
Several test publishers have contended that because of educational improvements, student abilities now surpass those demonstrated when the norms were set.
“It is clear that, if you use old norms, that is one factor that would contribute to the results,” said Mr. Linn. “If you take a current national average, and compare it to an old average, it will look like it is above average.”
Mr. Linn conceded that the question of whether the gains reflect genuine improvements is “hard to answer.”
But other indicators of achievement, such as results from the National Assessment of Educational Progress, have risen much less sharply than the results of normed tests, he noted, suggesting that the rise in scores is not due solely to gains in achievement.
According to Lorrie A. Shepard, another author of the study, some of the gains reflect efforts by schools to adjust their curricula and instruction to match the items on the test.
Such efforts have intensified in recent years, she noted, as 40 states have turned to “high-stakes testing” programs, in which test scores are used to rank schools or influence decisions about students or teachers.
“Intense pressure on educators to improve scores,” she said, “sets the stage and increases the incentives for the various types of teaching to the test.”
Surveys of state testing directors, said Ms. Shepard, professor of education at the University of Colorado, found that most states select tests that match their curricula; many teachers emphasize the test objectives; and many schools employ test-preparation strategies.
Moreover, she noted, teachers familiar with test items from prior years can teach those items to their students. On some tests, she pointed out, “if half the class already knows the vocabulary words the teacher has remembered, then the teacher only has to be sure that the rest of the class learns two vocabulary items to increase the class standing on the vocabulary subtest by five percentile points.”
To improve the accuracy of test scores, Ms. Shepard recommended that publishers develop fresh tests every year.
Although such a solution would be expensive, she said, publishers could hold costs down by testing fewer grade levels or different subjects each year.
But Ms. Baker of UCLA suggested that states should consider new types of testing. Such tests, she said, should enhance, rather than distort, classroom practices.
“Instead of thinking about the impact of Cannell’s analyses as the ‘emperor’s new clothes’ metaphor,” she said, “I propose we think about deposing the emperor.”
“Measures that tap into the way students learn,” she said, “that sample domains adequately, and that don’t force teachers and principals to focus on the test itself but empower them to learn instructional approaches that will allow students to learn the critical subject matter and skills are what we should be about.”
A version of this article appeared in the April 05, 1989 edition of Education Week as Physician’s Test Study Was ‘Clearly Right,’ A Federally Sponsored Analysis Has Found