Education Funding Opinion

Making a Silk Purse...

By Catherine E. Snow & Jacqueline Jones — April 25, 2001 8 min read
This is how a national system of annual student testing might work.

A centerpiece of President Bush’s education plan, currently under discussion in Congress, is the proposal for annual math and literacy tests for all children in grades 3-8. The benefit envisioned is the improvement of schools. The costs budgeted are not negligible. If adopted, the testing plan would cost $400 million and would absorb many classroom hours that might otherwise be devoted to instruction. Thus, the proposal should be judged on both its potential benefits and its costs, and should be formulated to maximize the likely benefits.

Like previous large-scale attempts at education reform, this proposal could be a lever for improvement, or it could be an expensive, time-consuming, misdirected, and frustrating failure. For this initiative, as for others, the devil is in the details.

Under what conditions would annual testing actually generate educational improvements? In what form should the annual testing be implemented to achieve its desired effects? Let’s begin with some necessary preconditions.

The conditions ensuring the greatest benefits from annual testing include an enhanced public understanding of how tests work and what they tell us. The public needs to understand, for example, that tests, by themselves, cannot improve educational outcomes. They can lead to improvement only if they become a stimulus to change in the educational system—a basis for improved curricula, upgraded instruction, better professional development for teachers, and better distribution of resources.

While holding school districts, schools, and teachers accountable is only fair, the public also needs to understand how financial resources, student demographics, and teacher preparation affect a school’s performance. Since these contextual factors are usually outside a school’s control, it is not fair to ignore them in comparing school outcomes. Accountability systems can work if they give schools specified goals and undercut easy excuses for failure. But the results of accountability testing can also be misleading. If we simply compare scores across schools without taking into account change over time, schools that have shown great improvement can look bad in comparison with schools where children score higher but make less progress.

How, then, should President Bush’s annual test be implemented to maximize the stimulus to educational improvement and to minimize damaging effects? We propose that the two crucial features of an effective annual testing system would be a mechanism for using the test results to distribute instructional resources and a mechanism for minimizing both teaching to the test and likely misinterpretations of the results.

Use scores for improving instruction. When administered in the context of an ongoing program of classroom-based assessment and professional development, properly selected and properly interpreted tests can do the following: provide information about children’s performance levels; identify the children who need extra instructional attention; and identify the classrooms in which teachers need extra instructional support.

But fulfilling these various functions requires selecting the appropriate tests, properly interpreting test results, and then actually using test results to inform instruction. Remarkably, the individuals responsible for making testing decisions typically know rather little about how to select, interpret, or use test results. We can hardly expect administering millions of tests to improve education if few educators know how to use the data.

It is a basic principle of test design that different functions require different tests. Our proposal violates this principle in suggesting that an accountability test could also provide instructional information. We suggest that the annual test should be designed as a screen, to identify children who need help mastering the basics of math and reading. The information about the number of children who achieve scores above the cutoff, if appropriately filtered (see below), can reflect school effectiveness. At the same time, this information identifies children who need further, more diagnostic assessments that can be used to help teachers decide what sort of instruction to provide. We propose that it also be used as a basis for distributing professional-development resources according to need.

Test scores within a classroom should become the basis for allocation to that classroom of professional development and support to the teacher. Thus, classrooms in which a very high percentage of children receive scores below the acceptable level would receive more aid, in the form of instructional mentoring or coaching for the teacher, help in administering follow-up assessments designed to guide instruction, and resources for supplementary materials or extra classroom personnel. Classrooms in which only a small percentage of children scored below the cutoff would receive less aid. Of course, if tests are to be used to target instructional support, then they must be administered in such a way that the information from them is available immediately and early in the school year. Thus, we argue that an early-fall administration of these tests is highly desirable.
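
The allocation rule just described can be illustrated with a minimal sketch. It assumes each classroom's screening scores are already in hand; the cutoff of 50, the room names, and the function names are hypothetical, chosen only for illustration, and the ranking is one simple way to order classrooms by need, not a prescribed formula.

```python
from collections import defaultdict


def share_below_cutoff(scores, cutoff):
    """Return {classroom: fraction of children scoring below the cutoff}.

    `scores` is a list of (classroom, score) pairs.
    """
    totals = defaultdict(int)
    below = defaultdict(int)
    for classroom, score in scores:
        totals[classroom] += 1
        if score < cutoff:
            below[classroom] += 1
    return {c: below[c] / totals[c] for c in totals}


def rank_by_need(scores, cutoff):
    """List classrooms from highest to lowest share below the cutoff."""
    shares = share_below_cutoff(scores, cutoff)
    return sorted(shares, key=lambda c: shares[c], reverse=True)


# Hypothetical fall screening results (illustrative numbers only).
sample = [
    ("Room A", 35), ("Room A", 42), ("Room A", 61), ("Room A", 38),
    ("Room B", 70), ("Room B", 66), ("Room B", 45), ("Room B", 81),
]

print(share_below_cutoff(sample, cutoff=50))
print(rank_by_need(sample, cutoff=50))
```

In this made-up case, three of four children in Room A fall below the cutoff against one of four in Room B, so Room A would be first in line for mentoring, follow-up assessments, and extra personnel.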

From data to information. Of course tests can provide data about how schools are doing. But such data do not constitute information about school performance if we just compare test scores across schools. We need to compare children’s test scores across time. Since in some urban areas 30 percent to 50 percent of students in a classroom in April may not have been there in September, a school’s average test score is based on the performance of many children who have hardly received instruction in that school setting. Particularly in high-transiency settings, a school’s average test scores reflect who showed up on the day of testing, not how much the school has taught its children.

Furthermore, the huge differences in test performance between urban and suburban schools often point to the experiences children bring with them as much as the experiences schools provide. Finally, the financial resources available to suburban schools are much greater than those available to the schools that typically score poorly. In using test scores to judge schools, we must disaggregate the impact of student mobility, school resources, and the extent to which children arrive already knowing what the school is trying to teach.

We suggest that any test’s use for school accountability purposes must be limited to data from children who have been in the school for at least a year, and that schools should be held more accountable for their longer-enrolled students. Doing this would require student identification procedures so that students’ school histories could be established (amazingly enough, many large urban districts do not currently have this capacity), and it would require tests designed to be comparable across grades 3-8. Comparing scores across differently designed tests is extremely difficult, if not impossible. If we wish to invest in accountability, we need to invest in designing tests that can give us the information we need to make sound decisions.
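
A minimal sketch of this filtering, assuming a district can link each child's records through a unique identifier and that scores from consecutive years sit on a comparable scale. The record fields, the 365-day threshold as a reading of "at least a year," and the sample numbers are all hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class StudentRecord:
    student_id: str       # unique identifier that links records across years
    days_enrolled: int    # days the child has been enrolled in this school
    last_year_score: int  # assumed to be on the same scale as this year's
    this_year_score: int


MIN_DAYS = 365  # hypothetical reading of "at least a year"


def mean_gain(records, min_days=MIN_DAYS):
    """Average score change for children enrolled at least `min_days`.

    Returns None when no child meets the enrollment threshold.
    """
    eligible = [r for r in records if r.days_enrolled >= min_days]
    if not eligible:
        return None
    return mean(r.this_year_score - r.last_year_score for r in eligible)


# Illustrative records only: the third child arrived 90 days ago.
school = [
    StudentRecord("s1", 400, 40, 52),
    StudentRecord("s2", 730, 55, 63),
    StudentRecord("s3", 90, 30, 31),
]

print(mean_gain(school))
```

The recently arrived child is left out of the school's accountability figure, so the result reflects growth among children the school has actually had time to teach. Weighting longer-enrolled students more heavily would be a further refinement that this sketch does not attempt.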

When to test? If testing is meant to improve instruction, then end-of-year tests are worthless. Results from tests administered in spring are not typically even available to teachers until the next fall, by which time the children whose test scores they receive have moved on. Furthermore, even if the test scores were available immediately, they would arrive too late in the school year for changes in instruction to have much effect.

An additional disadvantage of administering accountability assessments in the spring is that it creates both pressure and considerable opportunity to teach to the test. While President Bush may believe that teaching children to perform well on a math or a reading test is equivalent to teaching math or reading, this is simply not true. Tests, by their very design, reflect only a sample of what we want children to know. Teaching the sample is not equivalent to teaching the entire curriculum. While in very dysfunctional schools teaching to the test may be better than what goes on normally, in most schools it represents a narrowing of the curriculum and a waste of precious instructional time.

Making it work. For the system we propose to work, the annual screening tests selected would have to be relatively brief, standardized in administration, machine-scorable, and able to identify those children who need help in basic math and reading. If states are to choose their own tests, they would need a set of guidelines for selecting the screening test and guidance in prescribing appropriate follow-up assessments. A national test-review board might well be established to provide support in making these decisions.

If the tests are used to distribute professional development to those classrooms most in need, a coherent professional-development system, probably requiring increased funding, would be needed in every school district. Finally, as noted above, unique student identifiers that would make it possible to track individuals’ progress are needed for interpreting the data appropriately.

If a national system of annual testing is inevitable, experts in testing must be recruited to think creatively about how to make it serve both accountability and instructional needs. Teachers, principals, school board members, and the general public need information that can help them interpret test results appropriately.

The testing system must remain focused on upgrading instructional programs. Testing children does them no good unless it guides teachers in providing improved instruction, which in turn requires greatly enhanced professional development and support.

Annual tests should be one piece of an integrated system of ongoing classroom-based assessment and professional development, targeted where the need is greatest.

Catherine E. Snow is the Henry Lee Shattuck professor of education at Harvard University’s graduate school of education in Cambridge, Mass., and a member of the Board on Testing and Assessment. Jacqueline Jones is a visiting associate professor at the graduate school and a senior research scientist at the Educational Testing Service in Princeton, N.J.

A version of this article appeared in the April 25, 2001 edition of Education Week as Making a Silk Purse...

