Promise, Pitfalls Seen in Creating National Exams
The newly launched plan to establish a national examination system offers tremendous promise for improving American schools--if its sponsors can pull it off, assessment experts say.
Sponsors of the plan, which last month got off the ground with $2.45 million in foundation grants, are currently lining up states and districts to participate in the development phase. The plan is scheduled to be fully operational by the year 2000.
If it is successful, say the sponsors--the National Center on Education and the Economy and the Learning Research and Development Center at the University of Pittsburgh--the exam system will encourage all students to reach high standards for performance, while at the same time allowing local schools discretion in how they teach toward those standards.
Dale Carlson, director of the California Assessment Program, said the balance between national standards and local flexibility is the project's greatest virtue--and its greatest challenge.
"Is it possible to have standards--which is what we are all after--and still not dictate what is taught?" he asked. "That's the tallest order to pull off."
Mr. Carlson and others noted that the groups face substantial technical hurdles in devising a way to ensure that performance is measured according to common standards. The task will be particularly difficult because the exam system will rely heavily on alternative methods of assessment, including projects and portfolios, that have yet to be used on such a large scale, the experts added.
In addition, they noted, the exam system must be coupled with other reforms, such as improvements in teacher training and the reporting of assessment results, to be effective in boosting performance.
Lauren B. Resnick, director of the LRDC, acknowledged that there are major issues to be resolved. But, she said, the purpose of the project's development phase is to demonstrate whether it is feasible at all.
"That's what the first 18 months is meant to be," she said. "It's an actual try to see if we can get the process started, and keep it true to its goals and intents."
Even if the groups manage to get the system in place, added Howard Gardner, professor of education and psychology at Harvard University, educators should not judge the success or failure of the effort prematurely.
"It took 100 years to get standardized tests to their current mediocre state," he said. "It would be Panglossian to say we can take these new exams and whip them into shape right away."
Moreover, Mr. Gardner added, "To convert the country to produce kids who can do this stuff will take decades."
"We have a quick-fix, sound-bite mentality," he continued. "If after two years we don't see anything different, we'll say it failed."
While the idea of creating some sort of national test is not completely new, the issue has moved rapidly up the education agenda in the past few months. (See Education Week, Sept. 26, 1990.)
Earlier this month, members of President Bush's advisory panel on education policy presented him with a plan to create national standards for student performance and tests to measure such performance.
This week, a new national organization, known as Educate America, was expected to propose the development of a battery of national achievement tests for all high-school seniors at public and private schools.
In addition, other groups, including the Secretary of Labor's Commission on Achieving Necessary Skills and the panel established by the National Governors' Association to monitor progress toward national education goals, have discussed the idea of national student assessments.
The growing interest in national tests has made it more likely that such a concept might come to fruition, noted Marshall S. Smith, dean of the graduate school of education at Stanford University.
"Six months ago I would have said it was a fantasy," he said. "Now, something that cuts across states, has a common set of curriculum frameworks that drive an assessment system kids can study for, seems plausible."
To a certain extent, many of the various assessment proposals bear a common stamp--that of Ms. Resnick. In addition to directing the exam-system project, Ms. Resnick serves as a member of the Labor Department panel and as chairman of the student-achievement resource group for the goals panel. She also met this month with Paul H. O'Neill, chairman of the President's advisory group.
But the rise of the issue also reflects the fact that the national education goals have heightened the need for a national system of measuring performance, according to Richard P. Mills, commissioner of education in Vermont.
"If [the goals] will stand, we have to have thoughtful ways of measuring performance," he said. "We are on the cusp of change in this country in the way we measure performance. Everyone agrees that the standardized tests of the multiple-guess variety don't measure the performance we are looking for. We have to scramble, we have to invent."
Ramsay W. Selden, director of the state education-assessment center for the Council of Chief State School Officers, said the different groups discussing national assessments should coordinate their efforts to ensure that they are striving toward the same goals.
"We have an opportunity to be thoughtful in referencing these programs to a single set of curriculum frameworks if that's what we [want to] get kids to learn," Mr. Selden said.
But Mr. Smith said such a move would be premature.
"There are as many conceptions of what a national test might be as there are people and groups" advocating one, he said. "For them to coalesce behind something is a giant step."
"I'd hate to see this end up on the scrap heap because we went at it too quickly and the warts started showing," Mr. Smith of Stanford added.
In contrast to the other proposals, which are still under discussion, however, the plan by the NCEE and the LRDC is already on its way toward implementation. Last month, the project was awarded grants from the John D. and Catherine T. MacArthur Foundation and the Pew Memorial Trusts. (See Education Week, Dec. 12, 1990.)
Under the proposal, the groups will create a system that would include three forms of examination--performance examinations, portfolios, and projects--to enable students to demonstrate mastery over the syllabus on which the examinations are based.
Although a national standards board, which would set standards for the system, would create the national examination, the system also would permit districts and states to develop their own examinations that met those standards. The national board would calibrate state and local examinations to the national standard, so that students' scores on those examinations could be equated with scores on the national examination.
Officials of the sponsoring organizations say the plan would create in the United States the kind of standard for exiting secondary schools that is common in many European countries.
But George F. Madaus, director of the center for the study of testing, evaluation, and educational policy at Boston College, said transplanting such a system into this country may not have the desired effect of encouraging students to work harder in schools. National examinations, he noted, are only one part of the education systems in European countries.
"The argument that because Europe has it, and their kids are scoring better than ours on [international assessments], is incorrect," he said. "If you look to Europe, a national exam is part of an infrastructure that has been built up. You can't pick one piece out, export it, and think it's going to work the same way and have the same impact."
"The danger is, by emphasizing the national exam," Mr. Madaus added, "we won't consider other systemic issues that are really important."
For example, Mr. Selden said, the examination system must be accompanied by improvements in teacher training to ensure teachers can teach what the examinations will measure.
"I am becoming sensitive to the fact that using assessment as a strong signal of what is to be taught and learned is part of the picture,'' he said. "But if teachers don't understand how to teach [what is assessed], getting things changed over is going to be tricky."
Mr. Gardner added that schools should publish the examination questions, as European countries do, to provoke a national debate about student-performance standards.
"In Europe, exam questions are published on the first page of Le Monde, and are on the evening news," he said. "That changes the question--the question is not whether one ethnic group does worse, but why. If Asian kids don't write as well, is that because they don't have good ideas, good grammar, or a notion of what good writing is?"
Several experts argued, however, that the NCEE-LRDC plan includes several attractive features that distinguish it from other proposals and make it more likely to help raise the level of student performance.
Gregory R. Anrig, president of the Educational Testing Service, said the fact that the proposed assessment would be based on an agreed-upon syllabus will help schools improve what they teach.
"I don't believe you should let assessment drive instruction," Mr. Anrig said. "You should decide on instruction, and have the assessment rest on that decision."
The plan would also help coordinate efforts to develop performance assessments, which encourage the kind of instruction most educators agree should be emphasized, Mr. Gardner added.
"I'm personally convinced that having a lot of different efforts done at the same time to improve assessment is a luxury we can't afford,'' he said. "There needs to be a concerted national effort."
"Given that," he continued, "there is an opportunity for it to be not very good. This is the best chance it can be done in a responsible way. If something like this is designed and pulled off, it will create a school-leaving activity people all over the world will want to have."
But Chester E. Finn Jr., professor of education and public policy at Vanderbilt University, cautioned that performance assessment has yet to prove that it can be implemented on a large scale. The proposal, he said, "assumes a kind of assessment technology I don't think yet exists, and needs still to be developed."
"This will be a very high-stakes exam," he pointed out. "It can invite corruption and cheating. That's difficult enough with monitored exams. In things that develop over time, as in non-monitored settings, the opportunity for cheating and corruption multiply. So do issues of fair appraisal."
"I think it can be worked out," Mr. Finn said. "But I don't think it's solved yet. It's something that's a necessary part of the success of the system."
Edward Haertel, professor of education at Stanford University, also warned that the examination could become "corrupted" if it is used to hold schools accountable for student learning.
"I'm not sure you can design tasks that cannot be subverted and trivialized by teaching to them," he said. "That remains to be seen."
One way to avoid such problems, said Robert L. Linn, professor of education at the University of Colorado at Boulder, who is heading a technical committee for the sponsoring groups, would be to develop an examination that is "close enough to what you're trying to do in instruction."
However, he said, such a solution would not remove the danger of corrupting the test. "It's less of a concern, but not zero," Mr. Linn said.
Grant Wiggins, director of research for Consultants on Learning, Assessment, and School Structure, a Rochester, N.Y.-based group, also warned that political and fiscal pressures might force the proposal's sponsors to lower their sights and become less ambitious in their use of alternative forms of assessment.
"If it becomes something to 'cram for,' I'm not interested," Mr. Wiggins said. "Because of logistics and cost, it can be reduced to something more simple and feasible."
Ms. Resnick responded, however, that the demonstration project seeks to determine if the proposal can work as designed.
"There are many pitfalls and compromises that can drive it off the course it has in mind," she said.
In addition to helping improve instruction, noted Edmund W. Gordon, the John Musser professor of psychology at Yale University, the plan's reliance on alternative assessments allows students to use a variety of methods to demonstrate mastery on the examination.
As a result, he and others pointed out, teachers can use a variety of different approaches to teach to the common standards.
But, said Paul G. LeMahieu, director of the division of research, testing, and evaluation for the Pittsburgh Public Schools, such flexibility poses enormous technical problems. The sponsors must devise a way to ensure that all the examinations--including the state-level examinations that are to be calibrated with the national examination--are judged according to the same standards, Mr. LeMahieu said.
"This is a thoughtful response to a very difficult issue: Namely, how you reconcile a view of educational improvement that on the one hand suggests professionalization and empowerment of the school staff, and on the other hand seeks to impose accountability and a control mechanism from the outside," Mr. LeMahieu said. "The way the national exam system is proposed recognizes that. It's the only large-scale assessment system I know of that is constructed to preserving space where genuine professional practice and innovation can occur."
But, he added, "In order to pull it off, they set out for themselves what is probably their most difficult charge: Is it possible to calibrate a variety of different assessment approaches, and create a reasonably consistent application of standards?"
"We've never tried to do that," the Pittsburgh test director continued. "I don't know if it can be done. But it's a lot closer to what we need than what anyone is offering."
Mr. Linn of the University of Colorado said the task of judging different examinations is similar to one faced by university professors, who routinely evaluate term papers on differing topics.
"It will require judgment on the part of teachers to say, 'This is A-level work,"' he said. "I'm hopeful it can be done on a large scale."
Linking state-level to national examinations also poses enormous technical challenges, noted Mr. Selden of the CCSSO. Statisticians must ensure that the examinations' content and administration are similar enough for the results to be equated, he said.
"It's possible to have a strong correlation between tests that shouldn't be correlated with one another," he said.
But Ms. Resnick, while acknowledging that devising a way to calibrate state and national examinations is "the biggest technical challenge we face," said the groups are after more than a statistical correlation. Rather, she said, the plan was aimed at ensuring that all students are judged according to the same standards.
"If we allow ourselves to get trapped into the statistical-equating meaning of 'calibration' that many assumed we meant," she said, "the whole thing might not be worth doing."
At least one psychometrician, though, suggested that what appear now to be technical constraints need not hamper the national-examination proposal.
"We need to develop different understandings about psychometric principles as this is implemented," Mr. Haertel said. "If we take it seriously at all, we have to recognize it is a radical proposal."
Vol. 10, Issue 19, Pages 1, 18. Published in Print: January 30, 1991, as "Promise, Pitfalls Seen in Creating National Exams."