Accountability Systems ‘Mediocre,’ Study Finds

By Lynn Olson — February 18, 2004 4 min read

Not a single state in America has its whole system of standards, tests, and accountability policies “right,” argues a report from two Washington-based groups.

“Grading the Systems: The Guide to State Standards, Tests, and Accountability Policies,” is available from the Thomas B. Fordham Foundation. For more information about study methodology, e-mail AccountabilityWorks at: (Report requires Adobe’s Acrobat Reader.)

“Grading the Systems: The Guide to State Standards, Tests, and Accountability Policies” judges states on the quality of their reading and math standards, the content of their tests, the match between their standards and tests, the rigor of the tests, the tests’ technical trustworthiness, and state accountability policies.

The project, underwritten by the Smith Richardson Foundation, evaluated 30 state systems in each category on a scale from 1 to 5, with 5 denoting “outstanding” performance.

Although some states—such as Massachusetts, Pennsylvania, and Virginia—got high marks in three out of six categories, the multistate average is only “mediocre,” according to the report published by the Thomas B. Fordham Foundation and AccountabilityWorks, a consulting group.

But what’s most notable about the review is what’s missing.

See Also...

Read the accompanying story, “Ratings by State.”

States were included in the study based on whether copies of the state-administered tests in reading and mathematics (or an equivalent form of such tests) could be obtained for analysis. A few states, such as Massachusetts, post previously administered tests on their Web sites. Colorado, Illinois, Michigan, New York, and Pennsylvania agreed to make secure test forms available for review. In other cases, the reviewers used an alternate form of an off-the-shelf test that was available from the publisher but not identical to the one given in the state.

But in the vast majority of cases, the authors complained, states would not release basic information that should be available.

“While we did feel that it was a significant accomplishment that we could look at 30 state testing systems, it was unfortunate that 20 states did not feel the need to share their tests under secure conditions,” said Theodor Rebarber, AccountabilityWorks’ president.

Given that state accountability systems drive so much of what happens at the district, school, and classroom level, he said, “there’s a need for some external, independent review of what’s under the hood, so to speak.”

Bill Reinhard, a spokesman for the Maryland education department, said officials there could not recall the request. But, “we rarely let any kind of researchers look at our tests, even under secure conditions, because we don’t own a lot of the test items.

“There’s a lot of hoops to go through,” he added, noting that the process is difficult and time-consuming, “and if we started to do that, there’s some concern that we’d have to start doing it for a lot of different organizations.”

No Guarantee of Quality

To review state systems, the project assembled a team of individuals with expertise in the relevant academic-content areas and grades. With their advice, AccountabilityWorks established a set of “reference standards” that cover what the organization considers to be essential math and reading skills at each grade level. The reviewers then judged state systems against those standards.

The project also reviewed the “trustworthiness,” or reliability, of state tests based on criteria devised with help from Susan E. Phillips, a psychometrician and lawyer in private practice.

The project examined both criterion-referenced tests crafted by states to specifically match their standards and commercially produced norm-referenced tests that are used as part of state accountability systems. The latter measure how students stack up against a nationally representative sample of their peers.

But the study didn’t find that one type of test was always better.

In math, for example, the reviewers discovered that the content of norm-referenced tests was significantly better than that of criterion-referenced tests in the elementary and middle grades, and significantly worse at the high school level. That’s because, with a few exceptions, the norm-referenced tests for high schools incorporated only limited amounts of high school math, such as algebra or geometry.

The authors also found that while states’ custom-designed tests were better aligned with state content standards than were off-the-shelf exams, “the difference is not nearly as large as one might expect.”

But Mr. Rebarber said the weakest dimension of state systems was the rigor of state tests, or where the states set the cutoff scores needed to perform at the proficient level. Even states with challenging standards and tests often had cutoff scores that more closely resembled “minimum competency expectations,” he said. Massachusetts alone earned a strong overall rating on test rigor.

The researchers concluded that 18 states would have solid accountability systems if the federal No Child Left Behind Act was “fully and properly implemented.”

