Education Commentary

Reconsidering Standards and Assessment: Performance Assessment: An Emphasis on ‘Activity’

By Ruth Mitchell — January 24, 1990 7 min read

The commitment of President Bush and the nation’s governors to establish national goals for education raises the question of how progress toward such goals should be assessed. If we intend to measure the academic achievement of students and programs, norm-referenced, multiple-choice tests will not do. For a more accurate gauge--one that won’t narrow the curriculum--we need performance assessments.

Also called “authentic” or “alternative” assessments, these forms of evaluation directly measure actual performance in academic subjects. Norm-referenced, multiple-choice tests, in contrast, measure only test-taking skills directly, and little but good guessing indirectly.

Performance assessment focuses on activity, as opposed to the passive bubble-filling of multiple-choice tests. Its most familiar form is writing, already a standard feature of program assessment in 28 states. A writing component is currently being tested by the Educational Testing Service for use in the Scholastic Aptitude Test. Other techniques of authentic evaluation are also gaining ground. Open-ended questions in mathematics require students to write extended answers or draw their response to a problem. The California Assessment Program has included these kinds of problems in its 12th-grade test for the past two years, and in 1990, two-sevenths of the National Assessment of Educational Progress mathematics questions will be open-ended.

New York State assessed the science skills of its 4th graders in May 1989 with the performance-based Manipulative Skills Test. Connecticut has prepared a spectrum of performance tests, including vocational-skills assessments that were designed with the help of students’ potential employers--business and industry. That state is now developing science and mathematics performance assessments that will evaluate its “common core of learning.”

And the use of portfolios--collections of student work in any subject--is winning widespread attention. Vermont now includes them in its state assessment. The E.T.S. is developing workshops on portfolios in writing, and Harvard University’s Project Zero is working with arts teachers in Pittsburgh public schools on portfolios of creative writing. This approach could prove to be the most powerful of all performance assessments, if a reliable way of using it for accountability can be devised.

The Coalition of Essential Schools is spreading the use of “exhibitions” that demonstrate student mastery of whole curricular units. The Matsushita Foundation recently held a conference to urge its partnership schools in seven states to change to performance assessment as part of restructuring.

The variety of performance assessments is itself a virtue. Multiple-choice tests are all the same, whether they measure grammar, geography, mathematics, or reading, evoking from students glazed ennui, if not physical absence. But performance testing has what experts call “face validity”: There is an obvious correspondence between one’s understanding of a subject and the means of testing it. Since subjects differ, so do tests. They can also differ within a subject: Some history assessments are essays; others are mock trials.

There are two other reasons why performance assessments are gaining favor: They evaluate thinking, a major concern of American businesses dismayed by employees who cannot solve simple problems; and they enlarge rather than constrict the possibilities for classroom teaching.

It is a simple fact of life that assessment drives teaching: What gets tested is what gets taught. This is fine when testing involves, for example, the writing of essays that require the thoughtful combination of concepts with a wide range of details. But when teachers know that their students will only have to fill bubbles with No. 2 pencils, they concentrate on teaching discrete facts. Multiple-choice testing and its corollary, multiple-choice teaching, are major contributors to the boredom that pervades American classrooms.

In describing the undesirable tests, I have carefully avoided the term “standardized.” Performance assessments can and should be standardized--in the sense of providing standards of achievement. Grading is not difficult with performance assessments. The products--whether written, drawn, recorded, or even videotaped--are scored according to a rubric that lays out the qualities required for each grade. Even when scores are assigned by direct observers of the performance--in the case, for example, of science experiments scrutinized by trained teachers--the results can be aggregated and statistically manipulated, just as conventional test scores are.

A related benefit of performance assessment lies in the common practice of having groups of teachers grade the tests: These teachers gain the enormous advantage of seeing the effects of their instruction. Because they thus face fundamental questions of teaching, their participation in the design, administration, and scoring of performance assessments is perhaps the most effective form of professional development. The money that a state or school district would have spent on commercial tests can be put into assessment strategies that foster professional growth. It’s a short cut to teacher empowerment.

This extra benefit will be lost if districts buy the packaged “performance assessments” that test publishers--who, like the E.T.S., have seen the handwriting on the wall--are now issuing. Many educators will have noted, for example, the colorful advertisement for the Stanford Writing Assessment Program offered by the Psychological Corporation, Harcourt Brace Jovanovich. According to the ad, which associates writing with jungle adventure, this testing package uses holistic and analytic scoring, and returns the results to teachers as a basis for language-arts activities. But the teachers will not have read the papers or enjoyed the intellectual challenge of developing a scoring guide to capture the essence of good writing.

The federal government has also sensed the wind. It now mandates nationally norm-referenced tests for reporting progress in Chapter 1 programs. But it has a committee of psychometric experts looking at possible equivalence between the required tests and the performance assessments state and local authorities might wish to use.

NAEP is moving toward performance assessment, but the elements it has incorporated so far--writing samples, open-ended tasks in mathematics and reading, and even a “writing portfolio” in 1990--are best characterized as tokens. The assessment is not yet the flexible, comprehensive instrument needed to authentically reflect progress toward national goals. In fairness, NAEP would need adequate funding from the Congress to do the job. Its puny budget is $4 million short of the amount needed to analyze data collected in 1990.

California is the first state to have declared a policy of shifting its testing program to performance assessments. Regional and county administrators have attended conferences introducing the idea and enlisting their help with the experiments and field tests needed to develop the new assessments. The development of the 8th-grade writing assessment has set a precedent for grassroots participation. In the handbook for the tests, four pages are needed to list the names of all the teachers and administrators who assisted with its development, field testing, and annual scoring.

A forthcoming statement to be signed by at least 20 organizations will urge President Bush and the governors not to use multiple-choice tests to measure progress toward national goals, but to require performance assessments. The cover letter will be signed by the Center for Fair and Open Testing, the Council for Basic Education, and the National Association for the Advancement of Colored People.

What may make the climate more favorable now for the shift to performance assessment is the widespread perception that teaching and testing are out of sync. The writing process introduced across the country by the National Writing Project has made multiple-choice tests of writing laughable. The pressure to introduce “thinking skills” has become pressure to evaluate them appropriately. The spate of revelations about what our students don’t know has forced examination of what they’re being taught and how.

A switch to performance assessment has the potential to benefit American education in three ways: It reveals the presence or absence of thoughtfulness and understanding, not simply of memorization; it requires the teaching of a thinking curriculum to all students; and, by involving them in assessment, it empowers teachers.

A version of this article appeared in the January 24, 1990 edition of Education Week