By All Measures: The ‘Most Promising Way’ of Getting the Education We Want

By Lauren B. Resnick, June 17, 1992

MS. RESNICK: It’s tempting to just take up Ted Sizer’s questions one by one, because they are so important and he poses them with so much clarity. I can’t do them all in the limited time I have, but I will address some of them.

Is changing tests, changing assessment, really the best lever for reform we can imagine? I want to suggest why it is, why a serious effort to change the assessment system under which we now operate might be our most promising way of getting to the kinds of education we all want.

The main reason is that we already have, in effect, a national testing system, based on the kinds of grade-level data that come out once or twice a year in every school district around the country that show whether the system as a whole, the individual school, and sometimes the individual class in a school, are performing up to grade level, above it, or below it. That constitutes our national testing system. That’s the heart of it. We cannot pretend that we’re starting with nothing, and now we’re going to move to some new system with all of its inherent risks.

There’s something else much more interesting, much better, called the National Assessment of Educational Progress, which is hardly in the news, hardly in the play. It happens only once every several years, and, at least up to now, nothing much depends on what the results are. We use it to monitor how the system as a whole is going. Many people don’t notice those scores; many teachers have no idea what NAEP is.

The tests that really make a difference are the Iowa Test of Basic Skills, the California Achievement Test, the Metropolitan Achievement Tests, and a couple more. These are our national testing system. Most of our kids take those tests at least once a year, sometimes twice, and, if they’re in any kind of special program, very probably several times. What’s more, subtly or overtly, teachers, principals, and school superintendents are held accountable for how those test scores look.

What’s wrong with the present testing system? Why is it worth doing the New Standards Project and taking all the risks inherent in a new national system of assessment? Why don’t we just stick with what we’ve got?

Well, what’s wrong is that those tests provide an absolutely terrible model of the kind of learning we want.

They are collections of items that you have to answer at the rate of, at the slowest, one per minute, in order to do well on the test. The passages that a student reads on these tests tend to be 500 words; that’s two typewritten pages. That is the most complicated text that our 8th to 11th graders should be reading, as measured by our de facto national testing system.

The mathematics items are even worse. And we measured writing, until very recently, by one’s ability to correct spelling, to check spelling errors, and so on. And to do that at the rate of about two items per minute. That’s what those tests look like.

No. 1: We’ve got a terrible model of what knowledge is, and what we care about, built into those tests: Collections of decontextualized and decomposed bits of knowledge that do not add up to competent thinking, to knowing a body of history or science or whatever we might care about. We have the assembly-line version of knowledge: Break it into little bits so any nincompoop can fill in the bubbles.

No. 2: We have tests that have built into them the theory that how far you can go is all in the genes and maybe in the first three to five years of family life. The tests spread kids out on a curve; the items, in the end, are selected to enhance that spreading out. All of the items on a math test do have something to do with arithmetic or math, but the way the tests are built, fundamentally, is by setting up a pool of a hundred, maybe a couple hundred, items, and trying them out on kids. From that pool, items are selected that will spread the kids out the most, because that’s what will enable them to be assigned to certain percentile ranks most reliably.

In effect, all the items that everybody can do, and all the items that almost nobody can do, are thrown out. But those are the two that are the most interesting in many ways. The worst thing you can imagine is throwing out those parts of the test that show that you’ve succeeded, and that show kids and teachers that they can succeed.

You also don’t want to throw out the ones that are making so much trouble that hardly anybody can do them, because, in a way, those are setting the stars to reach for, the targets that we’d like to get to, even if we can’t do it yet. Or, those we’d like some people to get to, even if not quite everybody.
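The selection effect described above can be sketched with a small calculation. This is a simplified illustration, not the test publishers' actual procedure: the item labels, pass rates, and the `select_items` helper are all invented, and the only statistic used is the variance of a right/wrong item, p × (1 − p), which is largest when about half the students pass and shrinks toward zero when nearly everyone or nearly no one does.

```python
# Hypothetical illustration: item labels and pass rates are invented.

def select_items(pass_rates, keep):
    """Return the `keep` item labels whose scores vary most across students.

    For a right/wrong item passed by a fraction p of students, the score
    variance is p * (1 - p): largest at p = 0.5, zero at p = 0 or p = 1.
    """
    ranked = sorted(
        pass_rates,
        key=lambda item: pass_rates[item] * (1 - pass_rates[item]),
        reverse=True,
    )
    return sorted(ranked[:keep])


# Fraction of students in an imagined tryout who answered each item correctly.
pilot = {"A": 0.98, "B": 0.75, "C": 0.55, "D": 0.45, "E": 0.20, "F": 0.03}

chosen = select_items(pilot, keep=4)
print(chosen)  # ['B', 'C', 'D', 'E']
```

Item A, which almost everyone can do, and item F, which almost no one can do, are the first to be dropped, because they do the least to spread students out.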

This is all elegant technology, and it’s done very honorably by the testing companies.

But the message is profoundly debilitating. If you start out in the 70th or 80th percentile, either as an individual student or as a school that’s blessed with students who come in able to perform that way, you sit back and say, “Well, I don’t really have to work; there’s nothing much more for me to reach for, I’m already in the top 20 percent or 30 percent.” And if you start out in the bottom, let’s say the 30th percentile or the 20th, the basic message, once you’ve come to understand it, is that you can never get out. Because the only way you can rise from the 30th percentile to the 70th, say, in real terms, is for everybody else to wait around for you to catch up, and even let you surpass them. Well, that isn’t going to happen, nor would we want it to.

It isn’t too likely that a child or a group of children who start in the bottom quarter are going to end up in the top quarter in a comparison. But they sure are likely, if they work hard and have a good curriculum and have teachers who are empowered and thoughtful and working hard, to learn a lot. And that’s what we want to be able to show them. And that’s what today’s national testing system makes impossible.
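A small worked example makes the difference between rank and learning concrete. The scores below are invented, and `percentile_rank` is a hypothetical helper using one simple definition (percent of the group scoring strictly below); real tests compute percentile ranks against a national norming sample instead.

```python
# Hypothetical illustration: all scores are invented.

def percentile_rank(score, scores):
    """Percent of the group scoring strictly below `score`."""
    below = sum(1 for s in scores if s < score)
    return 100 * below / len(scores)


fall = [20, 30, 40, 50, 60, 70, 80, 90, 100, 110]
spring = [s + 25 for s in fall]  # every student gains 25 points

# Follow the third-lowest student from fall to spring.
student_fall, student_spring = fall[2], spring[2]

rank_fall = percentile_rank(student_fall, fall)
rank_spring = percentile_rank(student_spring, spring)
gain = student_spring - student_fall

print(rank_fall, rank_spring, gain)  # 20.0 20.0 25
```

The student has learned as much as everyone else, yet the percentile rank reports no movement at all.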

So, for those two reasons, this present national testing system has to go.

If I thought I could get rid of it by waving a wand, I would be working in ways other than on assessment as a lever for change--investing heavily and directly in staff development, curriculum, and the things surrounding both of those.

But because I think there is zero political chance that this country, any time soon, will give up its current national testing system without having a replacement for it, the only way to get going on what we need to do is to attack directly what is one of the most powerful dampers to the kind of change we need. Talk to teachers who have caught on to the idea that the kind of teaching required in a “thinking curriculum” is possible, and then ask them, “What is the biggest barrier to it?” Their answer every time is, “Those standardized tests are coming, and I’m afraid my kids won’t pass them.” It takes courage of an extraordinary kind to say, “I’m just going to take the risk and forget about the tests.” The pressures to drill to the test are overwhelming, and they are overwhelming mainly in the schools that serve our poorest children.

So, I now reach the equity part of the argument.

In our more privileged schools, first of all, the kids come in doing pretty well. They don’t need to really worry too much about those test scores. There’s some worry, but it’s not overpowering. And so there’s a certain liberation.

But in addition, in those communities there’s an implicit knowledge about what’s really worth knowing.

In the inner cities, however, where the accountability burden is enormous, and where there is not an implicit knowledge spread through the community of what the alternatives are that really matter, the drill to those bubble tests can be overwhelming and stifling. There’s a double standard built into our current national testing system that holds poor kids, children of color, language-minority children, to a standard of what it is that’s even worth learning that is different from the implicitly held standards of the more-favored schools.

All of this constitutes what I consider a totally unacceptable risk to take: Namely, the risk of continuing our current system.

I am willing to take the risks, some of which Ted alluded to, of trying to move toward another system, because I think staying with the current one is more risky. We already know what’s happening as a result of it. So, it’s worth taking those risks to forge something better.

As Marc said, this isn’t about exams; it’s about improving education and learning for all kids. The way to make that happen is not by asking how do we build a better test and then hoping people will somehow get kids to do better on it. The way to make this happen is to ask: “How can we embed an internalization of these new goals and criteria and standards for learning throughout the education system?” And that means primarily among teachers. We think the only way to do that is by having teachers--and maybe principals and maybe curriculum supervisors, but largely teachers--be the primary agents of the whole process.

The New Standards Project is committed to the position that children will not be taking assessments unless their teachers have participated in the building and scoring of them. And so, in a process that has begun in the past year and will proceed in a kind of pyramid or training-of-trainers model, we are working with teachers in the 17 states and half a dozen school districts that are participating in the project.

They develop the tasks in an interaction with people from the national disciplinary organizations like the National Council of Teachers of English, the National Council of Teachers of Mathematics, and so on.

We are not test developers, and we are not ourselves standards developers; we are doing this with the nation at large. The teachers develop tasks, and they pass them back and forth between the subject-matter and curriculum groups, and eventually come to the kind of pilot that went on a few weeks ago with 12,000 children around the country participating.

The teachers, in so doing, are engaged in reviewing and even creating content standards. They do the scoring. These are matters for human judgment; to score the tasks Marc talked about, you have to understand what the criteria are. And our teachers participate in developing those criteria.

The discussion surrounding that scoring is a massive piece of professional development, and we intend to study that process. We know, in fact, that virtually any kind of technical scoring can be made adequately reliable. The question is: Which kinds of scoring activities flow back into teaching with the most power? That’s the question that matters. We can get adequate test reliability using any one of the technical methods that are around.

So, all of these things are part of professional development. Teachers are doing just what Ted said; they’re looking at student work. There’s a massive reliance on examples as the only serious way to set standards and build an understanding of them. And they will participate with other teachers.

All of this bears on how one even wants to think about the cost question. We have commissioned some work by economists to study actual costs, and we’ll have something by the end of the summer. What’s really important is the conceptual frame, though. In most studies of the costs of testing, testing is an add-on. The whole view we have is that assessment has to be part of the total system of professional activity by educators.

What we have to ask is: What does it take, in terms of cost, to have the kinds of professional development we need, to have the kinds of networking of teachers who support each other and develop standards and criteria? What kind of time off? This is probably the single greatest cost for this kind of a system; what kind of time from teachers will it take? And how much of that time is needed to be a good teacher? And we ask then what additional cost is needed to provide a test score for the public. We believe it will be very, very little over and above the cost of the good education that we’re aiming for.

Will there be new costs, a need for new resources for this? Yes. The best estimate would be four, six, eight weeks of paid professional time each year for teachers--that is, not with children, but rather developing themselves professionally, working on curriculum, working on the kind of preparation they need. In China and Japan, teachers spend up to half of their paid professional time not facing children, but developing their lessons, developing their material, and developing themselves.

So, the question about the cost of assessment simply cannot be asked in the traditional terms about what does it cost to administer and score a test. It’s the wrong way to view it.

A version of this article appeared in the June 17, 1992 edition of Education Week as By All Measures: The ‘Most Promising Way’ of Getting the Education We Want