Proposal for NAEP Is ‘Recipe for Disaster’

By Matthew Martinez, March 14, 1990

Chester E. Finn Jr.--the former assistant secretary for educational research and improvement in the U.S. Education Department who was appointed by former Secretary of Education William J. Bennett on his last day in office to chair the National Assessment Governing Board--is calling for further expansion of the National Assessment of Educational Progress to police American schooling (“The Need for Better Education Data,” Commentary, Feb. 7, 1990).

The new role proposed by Mr. Finn for NAEP would be methodologically unsound and would neither foster reform nor increase accountability. But it would devour funding for more significant research and services, and it would be politically loaded.

First, no matter how worthwhile their intentions, it is inappropriate for federal officials--or their proxies--to determine the agenda for local instruction. An acceptance of the nation’s diversity and the limits of centralized control has allowed competition and innovation for excellence in American education. In this context, NAEP has usefully spotlighted the need to strengthen our schools by providing a representative sample in an assessment that no one teaches to.

Since the Constitution and the law creating the U.S. Education Department establish that the federal government cannot control curricula, Mr. Finn proposes to drive a core curriculum indirectly through an extended NAEP. Concerned that such proposals might emerge, the House agreed to a limited, demonstration expansion of NAEP only on the condition that strong safeguards be put into P.L. 100-297 to protect the assessment’s established function and prevent abuses. This balanced and incremental role is now being attacked.

Second, it is not easy to make valid comparisons of achievement across 50 states and 15,000 school districts--much less across nations--with different curricula and highly diverse student bodies. Scores may change for reasons having nothing to do with program quality.

Despite the fact that family income, for example, consistently emerges as one of the strongest predictors of achievement, NAEP does not include information on this factor. As Ramsay Selden of the Council of Chief State School Officers has pointed out, while absolute standards are needed, measures must also be placed in a real-world context if they are to improve policy and instruction. And as Albert Shanker, president of the American Federation of Teachers, notes, if we focus on a single standard, we risk skewing instruction away from the children well above or below it.

Without independent, multiple measures and information on process to guide debate and policymaking, the proposed “new new” NAEP becomes a platform for loudmouths rather than a useful tool for improvement.

Third, since no test can assess everything, the rankings of schools and states necessarily bounce around according to what is in the sample of knowledge and the instrument used to measure it. The Congressional Budget Office pointed out in a 1987 study that findings on educational achievement “make a strong case against creating a single national achievement test. ... Only by comparing several tests can the analyst distinguish results that are consistent enough to provide a firm basis for policy from those that are merely idiosyncrasies of individual tests.”

As the “anomaly” in NAEP’s 1986 reading results shows, even the Educational Testing Service can run into technical difficulties. And the findings of the West Virginia advocacy group Friends for Education--indicating that most children score “above average” on nationally normed tests--highlight the complex educational, political, and market forces at work in assessment.

Fourth, we need to use testing for reform. How would the altered NAEP proposed by Mr. Finn be used? The information traditionally released by NAEP has made sense at every level of education. In this form, NAEP has been a powerful force for reform. In contrast, the past decade’s state-mandated tests have had little positive impact on education--and now some want to translate that failure to the national level.

Some propose turning NAEP into a high-stakes assessment that will directly affect funding, jobs, and curricula. It is safe to predict that scores will rise if the stakes become high enough: Under such conditions, testing begins to tell us more about curriculum “adjustment” and purchase of textbooks that look just like the test than it does about the ability of students to use what they have learned outside the classroom.

The alternative--basing policy on a national assessment without clear links to classroom practice--has other risks. A widely reported assessment of geography knowledge among students in Texas found that students “didn’t know” what nation was south of the Rio Grande. This finding says something interesting--but does it concern students’ ignorance or the unwarranted inference of the psychometricians and the press?

According to a report in Education Week, a small number of students in a rigorous California high school decided to “flunk” that state’s assessment as a protest against their having to take too many standardized tests. The school’s ranking plummeted, and red-faced administrators spent much time explaining.

Assessments of urban schools are unlikely to be free of such problems. This factor can lead to serious underestimation of what is being learned. In a nation where three out of four households have no children, this apparent documentation of failure is a political formula for undermining support for public education. Educational deficiencies must be pointed out--but in a way that assures they are real deficiencies, not statistical artifacts or ideological agendas in disguise.

Fifth, how much would a “new new” NAEP cost? The Alexander-James report that triggered the demonstration expansion of NAEP estimated that state-by-state comparisons could cost $26 million. As Mr. Finn more recently warned, an assessment providing state and local comparisons could easily cost $100 million. This figure presumably does not include more than a token move toward performance assessments: Reading essays or evaluating portfolios costs up to 300 times as much as a computer scan of multiple-choice questions. By contrast, the assessment of Chapter 1--the biggest federal program for precollegiate education--is budgeted at only $15 million.

And in addition to providing for an incremental, demonstration expansion of NAEP, P.L. 100-297 upgraded the responsibilities of the National Center for Education Statistics. But this year’s entire budget for the wide range of activities at the NCES is only $40.3 million--less than half what Mr. Finn proposes to spend on NAEP.

Costs for the current, limited expansion of NAEP are already far ahead of estimates. The Education Department recently “reprogrammed” $4 million to NAEP by deferring validity studies of the assessment, delaying a national assessment of adult illiteracy, and postponing work on the National Education Longitudinal Study. NAEP could easily become the black hole of education funding; spending on this program must be balanced with increases for educational services and for other research-and-development projects.

Sixth, what about the business of testing? Giving sole-source contracts for huge projects is always risky, and the education testing market is already dominated by very few players. The size and complexity of the ETS contract for NAEP means that no one can effectively monitor it and assure accountability.

As the reading “anomaly” and recent cost overruns indicate, even if all parties are well-intentioned, balance and accountability are still needed. Reinforcing industry concentration, the huge NAEP contract short-circuits the competition so vital to the scientific process in keeping researchers and policymakers honest, and so fundamental to improvements in assessment.

Seventh, any national assessment should supplement state, local, and private assessments--not displace or duplicate them. Mr. Finn, who so vigorously advocates competition in other areas of education policy, sees nothing but “a motley array of commercial tests and state assessment programs” when he surveys the field of assessment. In fact, it is the states and commercial testers that have spearheaded many innovations in the field. California, for instance, has a strong program of improved assessment under way, moving beyond the multiple-guess format for performance assessments.

There is no reason for NAEP to re-invent the wheel at great expense to taxpayers. Shared know-how makes federalism work for excellence. Despite the rhetoric from a few, we are not faced with a “top down” versus “bottom up” choice. In reality, the level and quality of data can be strengthened at all levels through effective cooperation. That is why P.L. 100-297 created a “cooperative education statistics” system under the NCES. NAEP should build on this national resource.

Openness and wide, continuing input are crucial if NAEP is to serve accountability and excellence. Recent actions of the National Assessment Governing Board have created concern that, for political expediency, it has used technicalities to avoid listening to good advice. State directors of several of the large-scale assessments were distressed after the board’s December meeting about its failure to consult widely; about what they perceived as the summary dismissal of the directors’ association; and about consideration of governing-board papers calling for NAEP to monitor progress toward national goals without those papers’ being circulated for comment to major educational organizations or the association of state assessment programs. This restriction of access and comment raises fears that the board and the ETS may try to confront other players with accomplished facts in assessment design and implementation--and that expansion of NAEP may be a federal grab for power.

These concerns have been widely shared from the beginning. The original NAEP was carefully circumscribed to avoid such problems. The Alexander-James report--which provided the model for the “new” NAEP--included a warning from the National Academy of Education that the assessment could not only begin to “exercise an influence on our schools that exceeds [its] scope and merit” but also evolve into a stifling national curriculum.

The study panel urged monitoring of NAEP’s influence on schooling, secure insulation from political pressures, effective checks and balances, and wide public and professional participation at all levels. It also recommended that NAEP be clearly linked to real-world performance rather than rely on a test generated by bell-curve assumptions and report by a statistical “factor” where no one knows what the factor scores really mean. And the panel warned that NAEP’s costs must not be allowed to undermine other vital data collection or preempt innovation and competition in testing.

The narrowly focused assessment hierarchy being proposed is a recipe for disaster. If the mutation of NAEP proposed by Mr. Finn ever comes to pass, who knows how many valuable programs will be killed and how many worthless projects will be funded?

The present, more limited assessment is useful precisely because it has done what no other source does: It provides a representative sample of students, and no one teaches to it. It makes federalism work efficiently by sharing expertise; it does not duplicate other efforts.

Vigorous action by the education community is critical in determining what role--if any--NAEP can play in building excellence in education for all Americans.

A version of this article appeared in the March 14, 1990 edition of Education Week as Proposal for NAEP Is ‘Recipe for Disaster’