Education Commentary

Beyond the Verbal Confusion Over ‘Tests’

By Ruth Mitchell, April 29, 1994

A verbal confusion is muddying the debate about a national system of standards and assessments: The words “tests” and “assessments” are being used as if they were synonymous. Because the word “test” fits better into headlines, the press is particularly prone to this confusion. It has to be cleared up if there is to be any clarity about what we can expect from national standards and assessments.

Briefly, tests are machine-scorable, usually multiple-choice and norm-referenced. Their essence is speed, cheapness, and psychometric respectability. Assessments, which is shorthand for “authentic,” “alternative,” or--my preferred term--“performance assessments,” are not machine-scorable, and vary in length and in what they require students to do. Their essence is the active production of response by the student.

Almost no one concerned with the legislation that may result in a National Education Standards and Assessment Council (NESAC) now supports a national test. The prevailing notion envisages a system of assessments developed by states, regions, or organizations certified by a national body (possibly NESAC) to assess progress toward national standards, plus the National Assessment of Educational Progress (NAEP) performing the same monitoring function it does now. Both the assessment system and NAEP would use performance assessments.

Although even Senator Claiborne Pell of Rhode Island has yielded as the last champion of national tests, the confusion infects the debate in the press and among educational stakeholders. It makes for strange bedfellows: In their “debate” on the front page of the Washington Post Education Review on April 5, Chester E. Finn Jr., an important influence on Administration policy, and Monty Neill of FairTest both repudiated tests and endorsed assessments, although presumably their contributions were intended to illustrate polar opposites.

Clearing up this verbal confusion is hardly trivial in view of what is implied by the use of these words. We are talking about two incompatible models of education. The difference between the “factory” model, which uses tests, and the “community of learners” model, which uses assessments, is like the difference between the Ptolemaic and the Copernican views of the universe. As in all paradigm shifts, changed relationships rearrange the value system.

Understanding that the debate is about different systems, not merely exchanging tests for assessments, will explain why otherwise clear thinkers seem to be taking contradictory positions when they write about national standards and assessments.

I’ve often heard people say that the form of tests doesn’t matter, that the real issue is the uses to which they are put. But the form of multiple-choice, machine-scorable tests is in itself a statement about the purposes of education, and therefore it affects the uses of testing. In the educational model that uses multiple-choice, machine-scorable tests, value is placed on ability to recognize discrete statements, to distinguish the “right” from the “wrong.” “Information” is defined as what can be recognized easily in textbooks and memorized.

Naturally, if what is valued is passive recognition of information, then teaching is bound to mean transmission of information in pellet-like form. The act of teaching is devalued by testing, because of the vast disproportion between the hours spent acquiring facts and algorithms in strict order of difficulty and the few minutes taken up by bubbling in a few ovoids.

The purpose of this system is essentially gatekeeping: selecting and sorting students for access to, or denial of, educational advantage, which, in the case of early tracking, means simply learning more interesting material than skill-and-drill. The concept of the “correct” answer at the heart of machine-scorable, multiple-choice testing is a metaphor for the purposes of the factory model: You’re in or you’re out.

The demands made on students by performance assessments are quite different: active application of knowledge and skill to problems, as much as possible drawn from the world outside the school. The definition of information reflects that of Norbert Wiener, the father of cybernetics: “To live effectively is to live with adequate information. Thus, communication and control belong to the essence of man’s inner life, even as they belong to his life in society.”

The consequences for teaching are profound, as teachers who understand the changes are discovering. They have every reason to complain that they are being asked to do too much without adequate compensation, since their pay scales and professional duties are based on the old model’s assumption that they are essentially textbook jockeys. Now they are expected to design assessments, participate in group-grading sessions, organize portfolios, read students’ writing no matter what subject they teach, and recast their own classroom activity from lecturing to coaching.

The purpose of this system is the development of each student’s intellectual, social, and emotional ability to function in the world. It is a noble and lofty ideal. I would be the last person to claim that any American school comes close to embodying it. But some schools (especially those belonging to the Coalition of Essential Schools), some districts, and some states have begun to realize that changing from tests to assessments entails a change in purpose and attitude throughout the system.

I have drawn this contrast in models not only for its own sake, but also to make the point that using tests to measure progress in schools where change is taking place is wrong-headed. So too is the claim that accountability requires a different kind of measurement than classroom assessment. When the Arizona reformers who had rewritten their state curriculum guides looked at the state’s commercial tests for accountability, they found that only 26 percent of the curriculum was covered by the tests. From that discovery arose the Arizona Student Assessment Program, which is designed specifically to perform the two functions--accountability and modeling of desired instruction--that some think incompatible.

If students keep portfolios and engage (for example) in investigations about the water quality in their community, what information can a multiple-choice, machine-scorable test provide about their abilities and experiences? For accountability, it is only necessary to sample portfolios as Vermont has done.

Furthermore, since we all agree that fewer assessments should occupy students’ time (although many performance assessments are indistinguishable from ordinary classroom activities), an assessment which gives students individual grades could be calibrated to the national standards to report on progress at that level, and samples from the same assessment could be culled for state accountability.

Adopting a national system of standards and assessments--not a national test--implies a national endorsement of the “community of learners” model. Performance assessment should be seen in the context of the model: It is by no means a panacea and should not be regarded as more than a useful instrument. It is one of many factors (for example, cooperative learning, whole language, active learning, the community as classroom, the student as worker) which undermine the factory model and provoke the asking of fundamental questions about the purposes of education. If we want to ensure that the paradigm shift happens in time to save American public education, we must focus our vision on the gestalt and not oversell individual features.

To fulfill its promise as a signal of profound change, a national system of standards and assessments must be supported by resources. Otherwise, the imposition of sophisticated assessments without the means to prepare students for them is simply national cruelty. Apart from large infusions of materials into schools which now lack photocopying machines (let alone computers), the major resource needed is the professional development of teachers.

As I mentioned above, teachers are presently squeezed between paradigms. They need to become fully professional, to work year round as other professionals do, with at least two months of time free from students in order to realize their place at the center of the development, administration, and scoring of national assessments. Questions about the quality of tasks--and therefore the quality of teaching--must be raised when there is time to consider them fully.

There is no guarantee that any of this will happen. National standards and assessments could be an empty political promise with no resources to realize them. Unfunded programs, greedy test publishers, public indifference, opportunistic politicians, and business shortsightedness all threaten to abort positive change. But without that change, Americans will be stuck with the educational equivalent of the flat-earth model.

A version of this article appeared in the April 29, 1992 edition of Education Week as Beyond the Verbal Confusion Over ‘Tests’