Published in Print: August 2, 2000


None of the Above

It is no secret to anyone who has picked up a newspaper in the past six years that public education has undergone a revolution. A wave of standards-based reforms in elementary and secondary education has swept the country, with many promising state initiatives and with strong leadership and support from the federal government. A central component of many of the state-based reforms has been the institution (or the planned development) of a series of high-stakes tests for students: tests that, by definition, measure learning as a condition of either grade-to-grade promotion or high school graduation.

One particularly striking feature of the public-policy discussion about educational reforms contemplated or under way is the frequently singular emphasis on students' scores on these high-stakes tests. Often, as if mere testing holds the answers to all of the problems that plague our schools, this focus on the testing programs or student test scores has not been accompanied by commensurate attention to related accountability issues, such as teacher professional development, or ways to ensure that schools receive the necessary resources to meet the needs of a rapidly growing and increasingly diverse student population. In short, all too often, tests have come to be viewed as the ultimate cure-all, rather than the benchmark to help gauge the performance of students and raise questions about what additional steps should be taken to improve their classroom learning and school experience.

Very recently, headlines advising of the "backlash" against standards reforms have surfaced, consistently linked to concerns about the high-stakes-testing practices associated with the standards movement. Some educators, analysts, and advocates have concluded that high-stakes-testing practices can only perpetuate the discrimination that, at times and in certain places, has plagued our public schools, thus doing more harm than good to the very students who need the most help. Some have gone so far as to conclude that the testing practices associated with standards reform are "dangerous." And stories like the one where high school students in one school deliberately flunked a state exam in protest of the "testing frenzy" or, in another, refused to take state-mandated tests at all, have served to highlight the concerns of some about the use of tests for individual student accountability—questioning the very notion that large-scale tests are appropriate measures of student learning for any purpose at any time.

Thus, much of the dialogue and debate surrounding standards reforms has centered on testing. And within that context, the discussion has tended to take place on the extremes of the spectrum, embracing an all-or-nothing approach to the issue. In short, many in the education world tend to express themselves as "for" or "against" high-stakes testing. Unfortunately, much of this public debate masks the fact that the issue is not so simple. Moreover, the prospect of a backlash against standards reforms because of concerns about the use of high-stakes tests should provide the wake-up call to legislators and policymakers who have been focused on student accountability in the name of testing and little else. Indeed, a central issue to be addressed for those of us who passionately advocate the positive features of the standards reforms is this: What are the steps we can take that will preserve the educational values associated with high-stakes testing (as one facet of standards reform) and, at the same time, will stem the tide of opposition to standards reforms that potentially threatens the accomplishments of the past decade?

There is no simple answer to this question. But an analysis of practices from state to state, together with the related, indisputable evidence about what constitutes good testing practice, suggests that there is at least one meaningful step that can, and should, be taken. The problem is not that at least 26 states have, or plan to have by 2003, testing programs that will condition a student's high school graduation on test results. The problem is the way that high school graduation decisions are to be made under most of these legislative proposals. With few exceptions, states are saying that in order to graduate from high school, a student must pass a high school exit examination, and that his or her otherwise consistent, excellent performance in all required coursework will not count, even where, for instance, the student can achieve no test score higher than a 69 and a 70 is designated as the passing score.

Contrast this legislative judgment to the principles of sound test use recently re-codified by the American Educational Research Association, the American Psychological Association, and the National Council on Measurement in Education. In their Standards for Educational and Psychological Testing, these leading bodies reaffirmed the common-sense tenet that no high-stakes decision affecting students during their elementary and secondary education should be based on a single test score.

Similarly, in January 1999, the National Research Council published a book entitled High Stakes: Testing for Tracking, Promotion, and Graduation. In that document, the NRC concluded that allowing students with low test scores on high school exit exams to earn diplomas if they meet or exceed other academic requirements, such as excellent grades in required coursework, would be most consistent with these standards.

Taken together, the work of these leading national organizations affirmed a simple, yet little-known principle of good testing practice: Tests are useful but not perfect measures of student learning. Tests should never be used as a sole criterion when making decisions that have significant consequences for students. Rather, test scores (on tests that have been validated for the particular use in question) should be considered in conjunction with other educationally relevant factors (such as grades) when making life-defining decisions affecting students.

The common-sense principle that multiple factors can help ensure better educational decisions affecting students is a widely shared view. In fact, all leading test publishers observe that, if appropriately validated for the particular use, tests are valuable instruments to use when making important decisions affecting a student's education. But they confirm that test scores are not perfect measures and should not be the sole basis for making such decisions.

The College Board, for example, says that test uses which "should be avoided" include "using test scores as the sole basis for important decisions affecting the lives of individuals, when other information of equal or greater relevance and the resources for using such information are available" (Guidelines on the Uses of College Board Test Scores and Related Data, 1988).

It is unfortunate when the questions about how best to test or when to test are subsumed under the more extreme (and less educationally relevant) question of whether we should test students at all. As a broad proposition, the value tests provide is indisputable: Tests serve as benchmarks of school performance and enable the tracking of improvements, as well as the identification of targets for work, over time; they can serve as useful bases for comparing similarly situated schools and districts to assist in program evaluation. And they can serve as performance measures by which to gauge administrator, teacher, and student performance in multiple ways.

At the same time, using scores on any one test instrument as the basis for a high-stakes decision affecting a student presumes a degree of testing perfection that simply does not exist. You don't have to travel far in the testing and education communities to hear the same term used with regard to this kind of practice: Many will tell you that there is a "mythology" about tests, and that we should take steps to debunk the misimpressions that can lead to bad testing practices that have real consequences for students.

We should work, then, to promote a better understanding of the principles which, if followed, can lead to educationally sound uses of high-stakes tests. The public discussion about educational testing practices will be much richer, and ultimately more beneficial to our students, when its focus is on the issue of how to use the right test in the right way at the right time in the right context.

So, when you next hear the question "Are you in favor of or against student testing?" you may respond, "None of the above."

Arthur L. Coleman is an attorney at Nixon Peabody LLP in Washington, where he provides legal, policy, and strategic-planning services to states and schools on a range of standards-reform issues in education. He served as a deputy assistant secretary in the U.S. Department of Education's office for civil rights until January of this year.

Vol. 19, Issue 43, Pages 42, 45

