Making a Silk Purse...

By Catherine E. Snow & Jacqueline Jones, April 25, 2001
This is how a national system of annual student testing might work.

A centerpiece of President Bush’s education plan, currently under discussion in Congress, is the proposal for annual math and literacy tests for all children in grades 3-8. The benefit envisioned is the improvement of schools. The costs budgeted are not negligible. If adopted, the testing plan would cost $400 million and would absorb many classroom hours that might otherwise be devoted to instruction. Thus, the proposal should be judged on both its potential benefits and its costs, and should be formulated to maximize the likely benefits.

Like previous large-scale attempts at education reform, this proposal could be a lever for improvement, or it could be an expensive, time-consuming, misdirected, and frustrating failure. For this initiative, as for others, the devil is in the details.

Under what conditions would annual testing actually generate educational improvements? In what form should the annual testing be implemented to achieve its desired effects? Let’s begin with some necessary preconditions.

The conditions ensuring the greatest benefits from annual testing include an enhanced public understanding of how tests work and what they tell us. The public needs to understand, for example, that tests, by themselves, cannot improve educational outcomes. They can lead to improvement only if they become a stimulus to change in the educational system—a basis for improved curricula, upgraded instruction, better professional development for teachers, and better distribution of resources.

While holding school districts, schools, and teachers accountable is only fair, the public also needs to understand how financial resources, student demographics, and teacher preparation affect a school’s performance. Since these contextual factors are usually outside a school’s control, it is not fair to ignore them in comparing school outcomes. Accountability systems can work if they give schools specified goals and undercut easy excuses for failure. But the results of accountability testing can also be misleading. If we simply compare scores across schools without taking into account change over time, schools that have shown great improvement can look bad in comparison with schools where children score higher but make less progress.

How, then, should President Bush’s annual test be implemented to maximize the stimulus to educational improvement and to minimize damaging effects? We propose that the two crucial features of an effective annual testing system would be a mechanism for using the test results to distribute instructional resources and a mechanism for minimizing both teaching to the test and likely misinterpretations of the results.

Use scores for improving instruction. When administered in the context of an ongoing program of classroom-based assessment and professional development, properly selected and properly interpreted tests can do the following: provide information about children’s performance levels; identify the children who need extra instructional attention; and identify the classrooms in which teachers need extra instructional support.

But fulfilling these various functions requires selecting the appropriate tests, properly interpreting test results, and then actually using test results to inform instruction. Remarkably, the individuals responsible for making testing decisions typically know rather little about how to select, interpret, or use test results. We can hardly expect administering millions of tests to improve education if few educators know how to use the data.

It is a basic principle of test design that different functions require different tests. Our proposal violates this principle by suggesting that an accountability test could also provide instructional information. We suggest that the annual test should be designed as a screen, to identify children who need help mastering the basics of math and reading. The information about the number of children who achieve scores above the cutoff, if appropriately filtered (see below), can reflect school effectiveness. At the same time, this information identifies children who need further, more diagnostic assessments that can be used to help teachers decide what sort of instruction to provide. We propose that it also be used as a basis for distributing professional-development resources according to need.


Test scores within a classroom should become the basis for allocation to that classroom of professional development and support to the teacher. Thus, classrooms in which a very high percentage of children receive scores below the acceptable level would receive more aid, in the form of instructional mentoring or coaching for the teacher, help in administering follow-up assessments designed to guide instruction, and resources for supplementary materials or extra classroom personnel. Classrooms in which only a small percentage of children scored below the cutoff would receive less aid. Of course, if tests are to be used to target instructional support, then they must be administered in such a way that the information from them is available immediately and early in the school year. Thus, we argue that an early-fall administration of these tests is highly desirable.
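The allocation rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the classroom names, the scores, the cutoff of 50, and the function names `share_below_cutoff` and `rank_classrooms_by_need` are all hypothetical, and a real system would need an agreed cutoff and a formula for turning the ranking into actual aid.

```python
from typing import Dict, List, Tuple


def share_below_cutoff(scores: List[float], cutoff: float) -> float:
    """Return the fraction of a classroom's scores that fall below the cutoff."""
    if not scores:
        return 0.0  # an empty classroom contributes no measured need
    return sum(1 for s in scores if s < cutoff) / len(scores)


def rank_classrooms_by_need(
    classrooms: Dict[str, List[float]], cutoff: float
) -> List[Tuple[str, float]]:
    """Order classrooms from highest to lowest share of children below the cutoff."""
    shares = {name: share_below_cutoff(s, cutoff) for name, s in classrooms.items()}
    return sorted(shares.items(), key=lambda item: item[1], reverse=True)


# Hypothetical early-fall screening scores for three classrooms.
classrooms = {
    "Room A": [35, 42, 58, 61],
    "Room B": [55, 62, 70, 48],
    "Room C": [71, 80, 66, 90],
}

# Hypothetical cutoff; the classroom listed first would receive the most aid.
ranking = rank_classrooms_by_need(classrooms, cutoff=50)
for name, share in ranking:
    print(f"{name}: {share:.0%} of children below the cutoff")
```

In this sketch the classroom with the largest share of children below the cutoff comes first, which matches the idea that aid should follow need rather than be spread evenly.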

From data to information. Of course, tests can provide data about how schools are doing. But such data do not constitute information about school performance if we just compare test scores across schools. We need to compare children’s test scores across time. In some urban areas, 30 percent to 50 percent of the students in a classroom in April may not have been there in September, so a school’s average test score is based on the performance of many children who have received hardly any instruction in that school. Particularly in high-transiency settings, a school’s average test scores reflect who showed up on the day of testing, not how much the school has taught its children.

Furthermore, the huge differences in test performance between urban and suburban schools often point to the experiences children bring with them as much as the experiences schools provide. Finally, the financial resources available to suburban schools are much greater than those available to the schools that typically score poorly. In using test scores to judge schools, we must disaggregate the impact of student mobility, school resources, and the extent to which children arrive already knowing what the school is trying to teach.

We suggest that any test’s use for school accountability purposes must be limited to data from children who have been in the school for at least a year, and that schools should be held more accountable for their longer-enrolled students. Doing this would require student identification procedures so that students’ school histories could be established (amazingly enough, many large urban districts do not currently have this capacity), and it would require tests designed to be comparable across grades 3-8. Comparing scores across differently designed tests is extremely difficult, if not impossible. If we wish to invest in accountability, we need to invest in designing tests that can give us the information we need to make sound decisions.
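A minimal sketch of this filtering step, written in Python, may make the idea concrete. Everything in it is an assumption made for illustration: the `StudentRecord` fields, the 365-day threshold standing in for "at least a year," the sample scores, and the premise that the two scores sit on a scale comparable across grades, which the text notes would require purpose-built tests.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Optional


@dataclass
class StudentRecord:
    student_id: str        # unique identifier that lets a school history be traced
    days_enrolled: int     # days the student has been enrolled in this school
    score_last_year: float # score on last year's test (assumed comparable scale)
    score_this_year: float # score on this year's test (assumed comparable scale)


def mean_gain_for_stable_students(
    records: List[StudentRecord], min_days: int = 365
) -> Optional[float]:
    """Average score gain, counting only students enrolled at least min_days."""
    stable = [r for r in records if r.days_enrolled >= min_days]
    if not stable:
        return None  # no basis for judging the school
    return mean(r.score_this_year - r.score_last_year for r in stable)


# Hypothetical records: the second student arrived recently and is excluded.
records = [
    StudentRecord("S1", 400, 40, 52),
    StudentRecord("S2", 90, 70, 72),
    StudentRecord("S3", 800, 45, 53),
]

gain = mean_gain_for_stable_students(records)
print(f"Mean gain for students enrolled at least a year: {gain}")
```

The recently arrived student has the highest score in this example but does not count toward the school's result, because the school has had little chance to teach that child. Weighting longer-enrolled students more heavily, as suggested above, would be a further refinement not shown here.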

When to test? If testing is meant to improve instruction, then end-of-year tests are worthless. Results from tests administered in spring are not typically even available to teachers until the next fall, by which time the children whose test scores they receive have moved on. Furthermore, even if the test scores were available immediately, they would arrive too late in the school year for changes in instruction to have much effect.

An additional disadvantage of administering accountability assessments in the spring is that it creates both pressure and considerable opportunity to teach to the test. While President Bush may believe that teaching children to perform well on a math or a reading test is equivalent to teaching math or reading, this is simply not true. Tests, by their very design, reflect only a sample of what we want children to know. Teaching the sample is not equivalent to teaching the entire curriculum. While in very dysfunctional schools teaching to the test may be better than what goes on normally, in most schools it represents a narrowing of the curriculum and a waste of precious instructional time.

Making it work. For a system such as we propose to work, the annual screening tests selected would have to be relatively brief, standardized in administration, machine-scorable, and able to identify those children who need help in basic math and reading. If states are to choose their own tests, they would need a set of guidelines for selecting the screening test and guidance in prescribing appropriate follow-up assessments. A national test-review board might well be established to provide support in making these decisions.

If the tests are used to distribute professional development to those classrooms most in need, a coherent professional-development system, probably requiring increased funding, would be needed in every school district. Finally, as noted above, unique student identifiers that would make it possible to track individuals’ progress are needed for interpreting the data appropriately.


If a national system of annual testing is inevitable, experts in testing must be recruited to think creatively about how to make it serve both accountability and instructional needs. Teachers, principals, school board members, and the general public need information that can help them interpret test results appropriately.

The testing system must remain focused on upgrading instructional programs. Testing children does them no good unless it guides teachers in providing improved instruction, which in turn requires greatly enhanced professional development and support.

Annual tests should be one piece of an integrated system of ongoing classroom-based assessment and professional development, targeted where the need is greatest.


Catherine E. Snow is the Henry Lee Shattuck professor of education at Harvard University’s graduate school of education in Cambridge, Mass., and a member of the Board on Testing and Assessment. Jacqueline Jones is a visiting associate professor at the graduate school and a senior research scientist at the Educational Testing Service in Princeton, N.J.

A version of this article appeared in the April 25, 2001 edition of Education Week as Making a Silk Purse...
