New User-Friendly Tests Provide More Information

Psychology's greatest contribution to science has been the notion of measuring mental attributes: the testing for achievement, ability, and aptitude by standardized examination. Curiously, however, while the influence of psychology on contemporary thought has grown enormously in recent decades, the standardized test has become the center of one of society's most strident controversies. Why do we find these tests simultaneously troublesome and fascinating? Why can we not agree on their uses and value? Why can we not either accept them for what they are or simply discard them onto the waste heap of once-tried but unsuccessful ideas? The complexity of testing theory and of the instruments themselves at times results in misunderstanding of the meaning of these tests. Nevertheless, the number of users continues to rise, and test writers work steadily at the refinement of their creations and at the discovery of new uses for them.

At first blush the issue seems to be straightforward. Standardized tests appear to have no intrinsic meaning at all. They do not absolutely reveal an examinee's ability, nor do they delineate a student's progress along the lines of a consistent learning theory. They provide no new information about why a student demonstrates a particular strength or weakness, and they say nothing about how he acquires certain skills or facts or intellectual processes. Whole facets of school experience--for example, original and imaginative thinking--cannot even be addressed by such tests merely because of the paper-and-pencil format.

This situation presents a perplexing dilemma to educators. They know that if they reject tests they must make judgments about pupils by less methodical means, which they know equally well are less reliable and even more subject to bias than the standardized tests. On the other hand, if educators embrace the use of standardized tests, they realize they risk trivializing their own important work in teaching young minds to think for themselves, reducing it to contrived responses on a fill-in-the-bubble answer sheet.

Further, by sorting people into ranked categories, the use of standardized tests runs counter to a shift in values in contemporary society. In place of the model of a meritocracy, wherein individuals are rewarded on the basis of comparative success, the ideal has become a strictly egalitarian society in which certain outcomes are guaranteed by the prescription of quotas and other such means. Because it supports the notion of individual merit, testing troubles many people and, for some, poses a conflict of values between a genuine desire to see everyone succeed and an intuitive trust in the worth of personal ability and achievement.

Aggravating the confusion, the television and print media like to report stories of the abuse of tests and test results by uninformed or ill-willed persons. Largely political organizations, such as the National Education Association, the National Association for the Advancement of Colored People, and Ralph Nader's group Public Citizen, deliberately set themselves up as defenders of the oppressed and portray the tests as victimizers.

Despite recitations of the shortcomings of standardized tests, such testing remains a popular procedure. More than 25 million people annually share the experience of taking a standardized examination of one type or another. At the same time, ever-growing numbers of school boards, accrediting agencies, governmental institutions, and other such groups are turning to tests as a means of satisfying a particular need. If all these millions of test takers and the thousands of agencies who use tests are victims, by whom or what are they victimized?

We may respond to the critics of standardized tests in several ways. First, we must realize that the theoretical principles involved in making mental measurements, as well as the statistics necessary for putting a theory into practice, are extraordinarily complex. Apart from specialists in the field, not many students or educators understand them. Long gone in test development are the days in which only a set of questions and an answer key were needed. The varied issues to be considered today in test construction include validity and reliability indicators, bias-detection strategies, and statistics with involved algorithms for scaling, scoring, and norming tests. Perhaps this is one reason why the detractors of tests are forced to rely on mostly undocumented anecdotes. Examples usually cite an individual here or there who was harmed in some way by the trial of undergoing an examination, or a case where a student was denied admission to some program and then by self-study went on to become a leader in the field. Though such stories may be true, they are misleading as evidence for putting tests on trial.

Second, we must note that standardized tests remain in use in schools because the overwhelming majority of educators who may have a use for them want them. Revile tests though we may, we find them indispensable. Other tools for assessing students lack the reliability and efficiency of standardized tests. Testing also provides information we cannot acquire in any other way. Teachers believe, for instance, that a student's test score reveals something about how that student might perform in a more generalized context. To be sure, the ample data correlating the scores on many of the tests with performance on related criteria warrant such confidence. But the point is that, if people did not believe in them, the tests would simply cease to be used at all. In fact, however, educators demand greater numbers of tests each year.

Furthermore, textbook publishers and other developers of curricula report an increased reliance on tests as significant determinants of subject-matter content. At the same time, growing numbers of teachers as well as test developers share the view that the test instruments themselves have instructional value. The strictly normative component of tests--the comparative ranking of an examinee's results against a representative sample of peers'--is being complemented and augmented by more sophisticated interpretations. By providing truly useful information, the interpreting of tests for their instructional utility makes them more meaningful to examinees. Like their computer counterparts, which at first appeared threatening but later were made more approachable and hence useful, the modern tests are "user friendly."

We might, for example, consider this simple mathematics item: "What is 32 percent of 100?" This problem could easily be mastered by most students at the end of 5th grade. The following revision of the same item, however, requires an entirely different thought process: "What is 100 percent of 32?" With its new structure, the item is no longer a simple calculation problem but rather a problem in conceptualizing the meaning of percent. When these items were tried with 6th graders of average ability, about 70 percent answered the first item correctly, but only about 35 percent responded correctly to the rewritten item. The information provided to teachers by a student's response to the conceptualization item is much more useful--"friendly"--than that given by the computation item.
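The arithmetic behind the two items can be checked directly. The short Python sketch below is illustrative only: the helper name percent_of and the variable names are assumptions introduced here, not part of any actual test. It shows that both items have the same numeric answer, so the gap in results reflects how the question is understood rather than the size of the calculation, and it compares the approximate percent-correct figures quoted above.

```python
from fractions import Fraction


def percent_of(percent, base):
    """Return `percent` percent of `base` as an exact fraction (hypothetical helper)."""
    return Fraction(percent, 100) * base


# Item 1: "What is 32 percent of 100?"
computation_item = percent_of(32, 100)

# Item 2: "What is 100 percent of 32?"
conceptual_item = percent_of(100, 32)

# Approximate shares of 6th graders answering correctly, as quoted in the text.
share_correct_item_1 = Fraction(70, 100)
share_correct_item_2 = Fraction(35, 100)

# Both items evaluate to the same number.
print(computation_item, conceptual_item)

# How many times more often the first item was answered correctly.
print(share_correct_item_1 / share_correct_item_2)
```

Both items evaluate to 32, yet on the approximate figures quoted the first was answered correctly about twice as often as the second.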

Slight though the difference between the two mathematics items might appear, it is significant. Not only does it display a subtle but important distinction in the thinking skill required of the examinee, but also, in providing more useful information, it demonstrates the increased sensitivity of the writers of tests to the needs of teachers and other test users. This instance reflects, too, a realization by users of some of the ways tests can help--rather than merely sort--examinees.

Such processes of reworking and recognition will ensure the continued improvement of standardized tests. Given the widespread use of tests, we shall all be the poorer without improvements; with them, we may gain fresh confidence in testing.

