The federal government must invest millions of dollars in improving the assessment of student performance to measure progress toward achieving national goals in that area, educators and measurement experts said last week.
President Bush and the nation’s governors signaled at their summit last week in Charlottesville, Va., that one of their key education goals for the nation is to improve student achievement.
Most observers agree that the National Assessment of Educational Progress, which for 20 years has measured student performance in a range of subject areas, is the most likely vehicle for gauging progress toward meeting that objective.
But the experts concede that costly improvements are needed if the program is to provide the kind of data necessary to measure progress. As primarily a multiple-choice test, they note, it measures only a narrow range of what students know and can do, and, at least for now, it offers limited guidance to state and local officials seeking to improve their students’ performance.
“I’m confident that, given an acceptable funding level, the field knows enough about good measurement to make NAEP really work,” said Marshall S. Smith, dean of the school of education at Stanford University.
“But all of the things [needed to improve the assessment] take development work,” he said. “A large part of the theory is there, but it will take money to carry it out.”
Despite such optimism, some critics remain skeptical that the assessment can be a useful measure of progress toward national goals. As long as it is used as a tool for holding schools accountable for results, those critics maintain, the test will force local administrators to concentrate on raising test scores, which may not result in improved student achievement.
But Ramsay W. Selden, director of the education-assessment center of the Council of Chief State School Officers, said a national test is essential to focus local school officials’ attention on meeting national goals.
“If we declare those kinds of learning goals, we’d better develop an assessment to measure them or the goals will go by the wayside,” Mr. Selden said. “The message to local schools is: What’s important is what is tested. Goals are ineffective if they don’t follow what is measured.”
‘The Only Game in Town’
President Bush’s summit with the governors comes on the heels of growing efforts by states and the federal government to improve the way they measure student performance.
As a report issued last week by the Center for Policy Research in Education predicts, “measurement of key aspects of a state’s education system, such as the performance of students and the various factors that affect that performance, will continue to be a ‘growth industry’ as state leadership applies its hand to the challenge of improving schools.”
At the national level, suggests Steven S. Kaagan, the report’s author, the federal government has in place a highly respected measurement tool.
“There are all kinds of criticisms one can make of NAEP, but it does tell you what kids can and cannot do,” said Mr. Kaagan, a former commissioner of education in Vermont. “It’s the only game in town that can begin to tell us important information of what kids can know and do.”
The summit participants “would be stupid to start from scratch,” added Chester E. Finn Jr., the chairman of NAEP’s governing board. “This is its role in life--to tell the country how it’s doing.”
To develop a new assessment, added Mr. Finn, a former assistant U.S. secretary of education for educational research and improvement, would require “five years in development, and then we’ll get baseline data. It will be two years after that before we’ll know whether we’ve improved. It’s not something that can be wished into existence.”
Test Is ‘Evolving’
Nevertheless, he said, NAEP has its limitations, including its heavy reliance on multiple-choice questions.
“There are things multiple-choice tests can do,” said Senta A. Raizen, director of the National Center for Improving Science Education. “They test factual knowledge, which is important--you can’t think without knowledge.”
However, she added, “there are things that such an assessment ipso facto cannot test, including the ability to walk into a situation, define a problem, and select an approach for solving it.”
The national assessment has experimented with broader measures of student performance, Mr. Finn said, noting that it requires writing samples as part of the writing assessment and will include some open-ended mathematics questions on next year’s test. And, he said, the agency will continue to “evolve” to include more sophisticated forms of assessment.
That evolution should continue “even if we didn’t have a national goal-setting exercise in Charlottesville,” he said. But, he cautioned, that process will be a long, slow one.
“Issues of standardization and comparability are embedded in this,” Mr. Finn said. “You know from an essay test how Iowa is doing. But to compare Iowa with Georgia or the national average will take a whole lot of prior agreement on what ought to be on those tests.”
As an alternative, Ms. Raizen cited a move in Britain to involve teachers in evaluating their students’ work as part of a national assessment.
“We have to re-establish some trust and faith in classroom teachers’ ability to evaluate what students do,” she said. “Teachers are in the business of collecting samples” of student work.
Advance the Schedule?
In addition to changing the assessment, the federal government could expand it more rapidly or enlarge sample sizes to make it more useful as a gauge of progress toward national goals, Mr. Finn said.
Under a law enacted last year, the agency next year will conduct a pilot state-level assessment of 8th-grade mathematics that will provide the first-ever state-by-state comparisons of student-achievement data. In 1992, the pilot will be expanded to include 4th-grade math and reading.
“If the President and the governors are serious about national performance goals,” Mr. Kaagan said, “does anyone want to advance the schedule?”
Increasing the sample size--another costly option--would also boost the test’s “analytic power” by enabling the assessors to draw more conclusions about sub-groups within states, Mr. Finn said.
Mr. Selden of the CCSSO pointed out that the 1990 math assessment will provide much more data than any previous NAEP test. Unlike previous tests, he said, it will show how students within each state performed “on a full, ambitious range of mathematics--problem-solving, conceptual understanding, not just rote mechanical skills.”
“NAEP will provide information back to schools they can use to evaluate curricular strengths and weaknesses,” he said.
But Carol Robinson, director of the department of planning, research, and accountability for the Albuquerque, N.M., Public Schools, said the assessment asks too few questions on each topic to help schools evaluate their programs.
“NAEP would have to be a lot more extensive in what it covers to give us specific information about school programs,” she said. “It’s useful, as far as it goes. But it doesn’t go far enough to be helpful in the total curriculum.”
‘Negative Side Effects’
But while most observers agree that NAEP will most likely be used as part of the national goal-setting effort, some critics charge that such a use may be inappropriate.
Eva L. Baker, director of the Center for Research on Evaluation, Standards, and Student Testing, a federally funded research center based at the University of California at Los Angeles, said that using tests as tools for holding schools accountable “ends up having lots of negative side effects.”
For example, she suggested, schools can place too great an emphasis on raising test scores. Alternatively, encouraging schools to meet a numerical goal may “put a cap on performance.”
Ms. Baker admitted, though, that hers was a politically unpopular position.
“If we could do away with the need to make comparisons between school districts and between states, that might facilitate improvements in the measurement of student performance,” she said. “But without comparisons, we don’t have the incentive to do that.”
Terry Peterson, executive director of the South Carolina joint business-education subcommittee, advocated that the federal government use multiple measures of achievement to ensure that the test results reliably reflect achievement.
“It’s preferable to have several measures,” he said. “Any one of them could be faulty. If they all agree, you can be comfortable.”
While the test may not be a perfect gauge of performance, suggested Mark D. Musick, president of the Southern Regional Education Board, it can help states identify problem areas and begin to seek improvements.
For example, he noted, North Carolina officials began such a process this year when their state ranked last among all states using the Scholastic Aptitude Test, which most educators consider a poor barometer of state performance.
“Is the SAT going to drive the curriculum in North Carolina? I doubt that,” he said. “Will there be increased attention to what is included on the SAT--math and verbal skills? Probably.”
“Will educators ask themselves if courses cover what students need to know to do well on the SAT? I don’t object to that kind of questioning,” Mr. Musick continued. “Those kinds of questions are beneficial.”
“To the extent that the national assessment leads to that kind of broad-scale questioning and introspection, that’s good,” he said.
A version of this article appeared in the October 4, 1989, edition of Education Week as “Learning Goals Said To Demand Better Assessment.”