School systems around the country are in transition. Propelled by a combination of evidence, logic, and intuition about the need for fundamental improvements in the content and management of public education, districts and states are continuing to exhibit an innovative drive that has in some ways been a hallmark of the American system since its beginnings nearly two centuries ago.
Whether intended or not, the basic approach to reform increasingly resembles an engineering cycle: prototype development, small-scale application, evaluation, correction, and (hopefully) continued improvement. A key to the validity and success of such processes is the collection of data and the objective analysis of progress, which in the world of education policy and politics is captured at least partially by the word “accountability.”
Indeed, the idea of accountability is deeply etched in American political and scientific culture. Citizens whose tax dollars go toward public goods such as education, defense, and transportation believe they have the right to seek answers to such seemingly simple questions as: Is our money spent wisely and ethically? Are our programs well-managed and effective? Do all our citizens benefit equitably? Do major policy reforms really make any difference? Policy decisions are ultimately made by legislators and policymakers, but they are often infused with data and analysis from independent and credible sources.
A quick scan of publicly available Internal Revenue Service data reveals a sizable demand for research aimed at informing public policy. The RAND Corp., for example, reported that for its work to “help improve policy and decisionmaking through research and analysis,” it received grants and contributions in 2009 totaling roughly $245 million. Together with just three other distinguished nonprofits in the “evidence sector”—the American Institutes for Research, the Urban Institute, and the National Academy of Sciences—total revenues were roughly $872 million that same year.
In the current era of school reform, when districts are shifting control from school boards to mayors, when traditional models of local schooling are supplanted partly or fully by market-oriented voucher or charter programs, when curricula and assessments are undergoing radical overhauls, when content and performance standards are being developed through consortia of states and private education organizations, and when development and compensation of teachers and administrators are subject to dramatic revision, it is no surprise that accountability for results becomes a high political priority.
We believe our recent experience in the nation’s capital, where public education has long been a focus of public concern (and sometimes despair), provides useful lessons to other districts and states involved in reform and its evaluation. When the District of Columbia schools were subjected to a major jolt with the passage of the 2007 Public Education Reform Amendment Act, or PERAA, the act came with a mandate for independent evaluation.
The “simple” question in this case was whether the reforms—which shifted control of the city’s schools from an elected board to the office of the mayor, created the position of schools chancellor, established the office of the state superintendent of education, set up a public charter schools board, and called for a host of other management and organizational changes—are having the desired effects on student achievement and other valued outcomes of education.
Fortunately, PERAA’s framers understood that, though their question was urgent, it was not simple, and that answering it would take time and expertise. And they knew that the usefulness of the answer would hinge on its perceived objectivity, independence, and credibility.
Vincent C. Gray, then the chairman of Washington’s city council (and today its mayor)—with the concurrence of then-Mayor Adrian M. Fenty, Deputy Mayor Victor Reinoso, Schools Chancellor Michelle A. Rhee, and state Superintendent Kerri L. Briggs—turned to the National Research Council, or NRC, the operating arm of the National Academy of Sciences, to design and launch a five-year study. In taking on the assignment, the NRC saw a rare opportunity to help shape the local and national reform agenda while contributing advice to its host city.
The NRC’s first report, released in March, provides a detailed model for evaluation in other districts, allowing for flexibility in setting evaluation questions and research priorities. Its key recommendation to the city is to establish and support an independent, sustainable entity that would develop indicators of educational performance, report annually on progress, and conduct in-depth studies of high-priority issues. The rationale for a continuing program of evaluation—as distinguished from one-time snapshots with “gotcha” conclusions—is based on the premise that first impressions, though important and interesting, are an inadequate foundation for long-term policy.
As the report notes, the most vexing problems faced by a reform-oriented school district require attention over time. For example, if mobility between traditional and charter schools is an indicator of parents’ assessment of the quality of teaching and learning, then these data need to be collected and analyzed at regular intervals and not just on a one-shot basis.
Similarly, if teacher quality is to be assessed reliably with data that include student achievement (itself a challenging task), a good evaluation system needs to have longitudinal capacity that allows for monitoring of change over time.
Only with a sustainable and continuing evaluation system can indicators be developed and used to track progress and improve the schools. As issues rise on the agenda of public concern—alignment between curriculum and professional development, the system’s capacity for educating vulnerable children with special needs, the condition of facilities and the system’s physical plant, to name a few—the evaluation system will have to be flexible, adaptive, and continuous.
The NRC report is cautious, especially when it comes to causal inferences based on cross-sectional data, and for some readers the lack of a thumbs-up or thumbs-down assessment of PERAA may be frustrating. The fact is that test scores did rise in the past several years, and, assuming they reflect real learning gains, that is good news. But evidence to support the conclusion that the gains are authentic indicators of improved learning and that they are at least partially attributable to the PERAA reforms is just not there. (Recent allegations of mischief, e.g., apparent evidence of the erasures of wrong answers on assessments, emerged well after the report was completed.) Detailed longitudinal data about students and teachers who have remained in the schools are needed to answer that question—a point worth considering by education reformers and evaluators anywhere mobility is an issue.
In fact, the NRC report issues an even more general (and potentially controversial) finding about the reliance on test scores as indicators of educational performance. The report affirms the importance of objective measures of achievement and argues strongly for continuing to include testing in the long-term evaluation strategy.
However, test results alone are inadequate as a basis for making judgments about the achievement of students, the quality of teachers, the performance of schools, or the success of reforms. Educational reform requires a broad and deep longitudinal-evaluation framework that includes academic performance, postschooling outcomes, and a full array of institutional processes.
Is there a bottom line in the NRC report about Washington schools? Yes, but it’s not as definitive as some might wish. Evidence available now indicates progress in some areas (that’s good news) and stagnation in other areas (that’s less-good news). Most important, however, are the strong signs city leaders have shown of a willingness to develop an innovative evaluation system that will provide rigorous and credible data to guide continued improvement over the long term (the best news of all).
Our preliminary answers about the District of Columbia reform law provide an important lesson for education policy research and the role of evaluation generally. Seemingly simple questions often evoke complicated and time-consuming answers. To reflect that reality, educational evaluations should resist the temptation either to dismiss seemingly simple questions or to offer oversimplified answers. Policymakers and education researchers should continue to strive for balance between the competing goals of political timeliness and scientific credibility. A shared commitment to providing useful and relevant information requires a balance between the demands of good science and the urgency of sound policy. Across the nation, sustained commitment to that balance, as exemplified in the NRC’s report, should be a high priority.
A version of this article appeared in the July 13, 2011 edition of Education Week as ‘Simple’ Reform Questions, But No Easy Answers