Standards & Accountability Opinion

Fixing the Race

By Donald B. Gratz — June 7, 2000
Is high-stakes testing like a race without a starting line?

Imagine the season’s final high school cross-country meet, run by the state Office of Racing. Officials pledge to demonstrate how each runner is doing, and which runners, teams, and coaches are the best. Judges crowd the finish line with cameras, computers, stethoscopes, and various diagnostic tools. As runners cross the line, not only their times, but also their pulses, breathing rates, perspiration levels, muscle oxidation, and a host of other variables are checked and computed. This is the new world of racing, officials say. We will know how everyone is doing, who is the best, and why.

Now suppose, despite all this measuring, that the race has no official starting line. Some runners run 10 miles, others five, still others two. Some coaches and parents complain that the race is not fair, but the Office of Racing is not deterred. This is an absolute measure, officials say, not a relative one. When runners cross the finish line is what matters, not where they started. Sure, runners who start farther back have farther to go, but hey, that’s life. After all, this race is only one tool of many for assessing running ability. Yes, it will determine future access to athletics for many runners. And yes, average team results may eventually influence the coaches’ jobs. And, of course, teams may be reconstituted if this year’s team performs worse than last year’s. But despite these important-sounding consequences, this race is just one indicator of athletic performance. Now, everyone stop whining and try harder.

The idea of a race with no starting line is absurd, of course. In racing, we understand that the distance traveled is critically important in determining the outcome. In fact, the key factor is each runner’s speed, or rate of progress. To know this, we need to know when and where each runner started. Only then, we understand, can we compare them fairly.

It is also true, in racing, that while we track team scores, we know that they are an aggregate of individual performance. If we compare last year’s team to this year’s, we recognize that the different outcomes—whether better or worse—are significantly affected by having different members on the team. We understand that different runners naturally get different results, even with the same coach. Do we think that effort, practice, and coaching quality make a difference? Of course. Do we think these factors make up for having different runners and different starting lines? Of course not. The race described above may provide information on runners’ physical conditions, but it does not tell which runner is fastest or has the greatest endurance because it doesn’t measure how far they have run. For the same reason, it doesn’t show which team is best or which coach is most effective.

Those who follow education will recognize this analogy as representing many of the new high-stakes tests being introduced into public schools. The analogy is not perfect, I admit, but it illustrates how many tests work. I will use the Massachusetts Comprehensive Assessment System tests for my example, but many state testing programs have similar flaws.

MCAS is given in the 4th, 8th, and 10th grades, with additional tests on different topics planned for other grades. MCAS is a long test with open-ended questions—up to 20 hours over several days—so it provides a lot of information on students, like the race above. While some states use normed national tests that measure students against other students, MCAS is supposed to measure students against the state’s new learning standards.

In theory, educators often prefer such tests, because they provide information on where students are doing well and where they need help. For this reason, districts across the country have developed complex descriptions of achievement in areas such as writing and math, called rubrics. A child’s work is regularly judged according to these rubrics, which provide detailed examples of student work at each level and for each grade. Such rubrics, which apparently inspired MCAS, allow teachers to chart student progress from month to month and year to year, and are an excellent guide to each student’s strengths, weaknesses, and needs.

So far, so good.

MCAS provides no baseline data for each student, so it cannot measure the student’s rate of progress.

But MCAS is not and cannot be used regularly with the same students. It provides no baseline data for each student (the equivalent of a starting line), so it cannot measure the student’s rate of progress. Rather, the baseline data that state officials cite are last year’s 4th or 8th grade scores compared to this year’s 4th or 8th graders—the same schools, but different students.

Thus, MCAS bases its judgment of improvement or decline on the scores of classes composed of entirely different members. It doesn’t know where the children in a given class, or the class as a whole, started the year. Consequently, it may indicate that 4th grade reading in a school has declined, even if this year’s 4th graders have improved over where they were last year as 3rd graders. What the state knows is that this year’s 4th graders scored lower than last year’s. But that isn’t what gets reported. The state says, and most people believe, that school performance has declined.
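The arithmetic behind this mistake can be sketched with hypothetical scores (all names and numbers below are invented for illustration; they are not MCAS data):

```python
# Hypothetical scores, invented for illustration only.
# Last year's 4th-grade class vs. this year's (entirely different children).
last_years_4th_graders = [72, 75, 78]

# This year's 4th graders: (3rd-grade score, 4th-grade score) for each child.
this_years_4th_graders = {
    "Ana": (55, 65),
    "Ben": (60, 70),
    "Cal": (58, 69),
}

last_avg = sum(last_years_4th_graders) / len(last_years_4th_graders)
this_avg = (sum(end for _, end in this_years_4th_graders.values())
            / len(this_years_4th_graders))

# The state compares the two cohorts and reports a "decline."
print(f"Cohort comparison: {this_avg - last_avg:+.1f} points")  # -7.0

# Yet every individual child in this year's class improved.
for name, (start, end) in this_years_4th_graders.items():
    print(f"{name} grew {end - start} points")  # all positive
```

The cohort comparison reports a seven-point drop even though every child in this year’s class gained at least ten points over the year; only the growth calculation, which requires each child’s starting point, reflects what the school actually did.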

This problem is magnified when different schools are compared. Because MCAS doesn’t tell us how far students have come or how fast they have progressed, we can’t make any judgment on the quality of a school or the capacity of a student. We can’t tell whether students at one school started behind their peers at other schools or are being taught less. A student who has made enormous strides may still score poorly if he started behind his peers. Similarly, a teacher who consistently teaches two years of material in a year may be judged less competent than one who teaches only a half-year of material, simply because the second teacher’s students started at a higher academic level (closer to the finish line).

Because MCAS describes a student’s status, it may be useful to teachers. But because it does not measure the student’s progress, it is a poor indicator of school or teacher effectiveness, or of student capacity. To assess these, we need to calculate students’ rates of progress based on their starting and ending points. It doesn’t matter where a runner places in a race if we don’t know how far he has run. Also, because schools are not teams, we should be more concerned with the progress of individual students than with the class average.

The MCAS and similar tests are inappropriate for the high-stakes purposes to which they are increasingly being put.

For these reasons, the Massachusetts comprehensive assessment and similar tests are inappropriate for the high-stakes purposes to which they are increasingly being put. MCAS is used to rate the absolute performance of schools and classes against a standard, without knowing where the children started the year. Does that sound like the cross-country race?

The state may decide to impose sanctions on schools and teachers based on these results in the near future. Even more perniciously, it will soon deny graduation to individual kids based on their test scores, still without understanding their individual progress or effort. MCAS may not be a bad test, any more than the medical tests used in the race above are bad. Rather, it is the inappropriate use of the tests, in both instances, that causes the problem. This inappropriate use makes the race analogy relevant.

This raises the question of standards for testing. What ethical standards should obligate states when they create high-stakes tests for students? Many states already deny, or will soon deny, graduation to students who fail a test. Others plan to deny promotion to students as young as 4th grade based on a single test score. These tests are called “high stakes” for a reason—they make a substantive difference in students’ lives. Are we not obligated, when charting a course that may alter a child’s future, to make sure we are doing the right thing? Are we not obligated to make sure our tests are fair to all, that they measure what they say they measure, and that we actually know how far children have progressed from their own individual starting points before claiming to know their ability?

Think back to the race described earlier. Isn’t tying graduation or promotion to a state test like saying that only students who arrive at the finish line within a certain time period will be considered winners? If we don’t know how far the runners have run, we still don’t know how fast they are. Some strong runners will not arrive in time, just as some promising and hardworking students may be flunked or denied a diploma based on largely arbitrary and inequitable criteria.

If MCAS scores rise because struggling students drop out, is that progress? If students, teachers, and schools are punished for slower finish times when they had farther to run, will that improve schools? It’s easy to see why parents of runners who run the farthest object to ranking runners according to who crossed the finish line first.

Despite the rhetoric, MCAS and many other state tests take this same approach, tying high stakes for students to this uneven race. In Texas, one result has been a significant decrease in the number of poor and minority graduates. Do we know enough about these students (or their schools) to say that they are failures, or is it simply that they had farther to run?

There should be greater accountability in education, and it can be structured fairly. But accountability measures ought to assess the improvement of individual students based on their individual progress. If we continue to conduct tests as though they were races with different starting lines, people will be justified in thinking not only that the race needs fixing, but also that it has been “fixed.” Right now, we appear to be using education not as the great equalizer, but as the great divider—the institution that prevents those who start farthest behind from ever catching up.

Educational accountability does not have to be this way. But unless officials stop defending the indefensible and start working with their critics toward common goals, students will continue to suffer unfair consequences.


Donald B. Gratz is a senior associate and the coordinator of national school reform for the Community Training & Assistance Center in Boston. He also serves as a member of the Needham, Mass., board of education.

A version of this article appeared in the June 07, 2000 edition of Education Week as Fixing the Race
