Accountability Commentary

But Are the Schools Getting Better?

By Heinrich Mintrop, December 08, 2008

Judging by state tests, school accountability systems are a success. In most states, test scores are going up. And such gains confirm, for the proponents of accountability, that the systems are working. Critics, however, point out that there are ways to raise test scores without improving student learning.

Meanwhile, and despite the shifting meanings that test scores may hold, low-performing schools in high-stakes systems continue to feel the stigma of failure, while their high-performing counterparts are held up as exemplary.

But are schools measured as high-performing by their accountability systems actually better schools? And could others learn from them what to do better?

My colleague Tina Trujillo and I wanted to know the answer. We sampled a number of California schools from both the top and bottom of the performance spectrum, controlling for demographics, and ended up with nine urban middle schools for our study. The differences between our high and low groups on the state performance indicator, viewed in the context of the state as a whole, amounted to about five years of growth. Such score differences ought to be tangible in the life and quality of schools, we surmised, if accountability measures are valid and relevant for school improvement.

For our study, we imagined educators from the low-performing schools traveling to the top-performing exemplars. What would they investigate during their visits? They’d check the schools’ orderliness; find out if students felt safe, cared for, engaged with learning, and challenged; observe teachers to see about time on task, instructional formats, and the cognitive complexity and tone of instruction; and sample student writing as to mechanics and content. They also would be interested in faculty cohesion, teachers’ and administrators’ sense of responsibility, innovativeness, strength of leadership, and improvement strategies. And they would want to know whether the system itself was important and meaningful for the teachers. Then we set about to translate our imagined travelers’ inquisitiveness into systematic research with robust survey, observation, and evaluation tools—and we controlled for biases.

As accountability systems take hold of educators’ minds and structure our practices, some of us have gotten into the habit of using higher performance scores as shorthand for higher school quality and more-successful improvement. But could we really make this connection if we didn’t know a school’s performance status? If we could do that, we’d feel better about the system’s validity and relevance for school improvement.

To keep biases in check, we determined that our imagined practitioners would travel to schools without knowing their test scores; in other words, the researchers conducted all of their analyses blind. We constructed a school-quality profile with 56 non-test-based measures for each school. Using these measures, two independent raters judged whether a given school was in the top or bottom performance group, and we conducted statistical comparisons, all with concealed test scores.

Our findings were surprising and instructive. To sum up, we were unable to correctly classify a sufficient number of schools in their respective test-based performance groups. Student surveys told us that, across all schools, regardless of the wide gaps in test-based performance, students felt safe, but only mildly challenged and engaged with learning. Schools in both the top and bottom groups were also quite similar in the quality of their observed instruction. Scores on student writing samples were slightly higher in the top group, but the difference was not statistically significant. Alas, if our imagined travelers had expected to encounter visible signs of an overall higher quality in the high-performing schools, they would have looked in vain.

One school in the high-performing group, however, did stand out with high test scores, higher lesson quality, and more-effective adult relationships—yet not with deeper student engagement. This was, in fact, one of the fastest-growing middle schools in the state for its demographic profile. It had strictly aligned its curriculum to state assessments, abandoned nonacademic subjects, and folded social studies and science into language arts and math. Students below grade level were given extra periods in a remedial literacy program. The school embraced the accountability system and used data to carefully track remedial needs. We encountered no direct test drilling there, and the adults were earnest, responsible, and oriented toward social justice.

Contrast this touted high-performing school with one of the lowest-growth schools in the study. Despite a formidable difference in test scores, the school in the bottom group received remarkably similar ratings on instructional quality from the blind raters. Here instruction was lively and complex, but test scores were depressed. The school ignored the accountability system and passionately rejected it as unworthy of professionals.

This school was an exception in that regard. The others in the study, top or bottom, pursued school improvement much as the high-testing school did. Tightening up, curricular alignment, more literacy remediation, and de-emphasizing nontest subjects were the most prevalent activities. But in no instance, we found, was better implementation of such strategies reflected in better instruction or more engaged or challenged students.

In the successful schools, teachers did manage to improve on standardized-test scores. Our observations suggest that these schools were committed to a highly focused coverage of standards-aligned materials within highly structured literacy and language arts programs taught in differentiated learning groups. Thus, our travelers seeking ways to improve their schools would have had to settle on a much narrower definition of quality, one that homes in on attitudes and behaviors that are quite proximate to the effective acquisition of standards-aligned and test-relevant knowledge, but that may go beyond mere drilling for the test.

Nine schools, however carefully selected and studied, are not a sufficient number of cases for making sweeping statements, to be sure. But if the pattern we detected among our nine schools were more widespread, we would have to rethink school accountability. Raising test scores is not a trivial challenge for many schools. Our highest-performing ones showed us how much hard work goes into it. But if we want to encourage educators to think more intensively about students’ joy of learning and teachers’ instructional practices, we have to find ways to move beyond a narrow agenda of alignment and standardization. Given that schools these days are fundamentally driven by external assessments, we would have to start by constructing assessment systems with different incentives and indicators that train the lens on what we value in education beyond test scores. And we would have to legitimize these measures on an equal footing with test scores to give schools room to explore and develop.

The idea behind current accountability systems is one of beautiful simplicity: Select a few key performance measures and enforce them with vigor, and all else will fall into place. It seems logical. If schools want to improve student test performance, will they not, sooner or later, shift their attention to instructional quality, student motivation, and all the other intangibles that make up the quality of the pedagogical relationship?

Our study suggests otherwise. Both high- and low-performing schools get stuck in a mode of school improvement that searches for the most direct connections among content, teaching, and testing. Students’ motivation for learning, as well as instructional quality, fades from view beyond the “rigorous” alignment and a “razor sharp” focus on material that needs to be re-taught.

Concentrating on a state test for the purpose of system monitoring makes sense, but bringing student engagement and teachers’ instructional practices to center stage requires more-complex performance profiles for schools. How to meld these more intricate profiles with the current architecture of test-based and sanctions-driven accountability will be next on the agenda of urban school reform.

The study on which this essay is based was published in the December 2007 issue of Educational Evaluation and Policy Analysis.

A version of this article appeared in the December 10, 2008 edition of Education Week as But Are the Schools Getting Better?

