Back in May 2010, hundreds of the nation’s education foundation, policy, and practice elites were gathered for the NewSchools Venture Fund meeting in Washington to celebrate and learn from the most recent education reform policy victories in my home state of Colorado and across the country.
The opening speeches highlighted the recent passage of Colorado Senate Bill 10-191, a dramatic law that required 50 percent of a teacher’s evaluation to be based on student academic growth. This offered a bold new vision for how teachers would be evaluated and whether they would gain or lose tenure based on the merits of their impact on student achievement.
Colorado would be one of several “ground zeros” for reforming teacher evaluation in the country. Many, including myself, thought these new state policies would allow our best teachers to shine. They would finally have useful feedback, be differentiated on an objective scale of effectiveness, and lose tenure if they weren’t performing. Teachers would be treated like other professionals and less like interchangeable widgets.
Colorado’s law and similar ones in other states appeared to be sound, research-backed policy formulated by education reform’s own “whiz kids.” We could point to Ivy League research that made a clear case for dramatic changes to the current system. There were large federal incentives, in addition to private philanthropy fueled by the Bill & Melinda Gates Foundation, encouraging such changes. And to pass these teacher-evaluation laws, we built a coalition of reform-minded Democrats and Republicans that also included the American Federation of Teachers. Reformers were confident we had a clear mandate.
And yet. Implementation did not live up to the promises.
Colorado Department of Education data released in February show that the distribution of teacher effectiveness in the state looks much as it did before passage of the bill. Eighty-eight percent of Colorado teachers were rated effective or highly effective, 4 percent were partially effective, 7.8 percent of teachers were not rated, and less than 1 percent were deemed ineffective. In other words, we leveraged everything we could and not only failed to advance teacher effectiveness but also created a massive bureaucracy and alienated many in the field.
First, the data. We built a policy on growth data that only partially existed. The majority of teachers teach subjects or grades that their states do not test. This meant processes for measuring student growth outside of literacy or math were often thoughtlessly slapped together to meet the new evaluation law. For example, some elementary school art-teacher evaluations were linked to student performance on multiple-choice district art tests, while Spanish-teacher evaluations were tied to how the school did on the state’s math and literacy tests. Even for those who teach the grades and subjects with state tests, some debate remains on how much growth should be weighted for high-stakes decisions on teacher ratings. And we knew that few teachers accepted having their evaluations heavily weighted on student growth.
Second, there has been little embrace of the state’s new teacher-evaluation system, even from administrators frustrated with the former system. There were exceptions, namely the districts of Denver and Harrison, which rated far fewer teachers highly effective than districts elsewhere in the state. Both districts invested time and resources in the development of a system that more accurately reflects a teacher’s impact on student learning. Yet most Colorado districts were forced to create new evaluation systems in alignment with the new law or adopt the state system, and most did the latter. This meant that these districts focused on compliance (and checking off evaluation boxes), rather than using the law to support teacher improvement.
Third, we continue to have a leadership problem. Research shows that teacher evaluators are still not likely to give direct and honest feedback to teachers. A Brown University study on teacher evaluators in these new systems shows that the evaluators are three times more likely to rate teachers higher than they should be rated. This is a problem of school and district culture, not a fault with the evaluation rubric.
Fourth, all of Colorado’s 238 charter schools waived out of the system.
We wanted a new system to help professionalize teaching and address the real disparities in teacher quality. Instead, we got an 18-page state rubric and 345-page user guide for teacher evaluation.
We didn’t understand how most school systems would respond to these teacher-evaluation laws. We failed to track implementation and didn’t check our assumptions along the way.
The new teacher-evaluation laws in Colorado and now 40 other states seem a classic example of putting policy ahead of practice. Great in theory, but unrealistic when it comes to implementation. The laws were constructed around a particular set of assumptions about school district capacity and commitment. We underestimated the propensity of districts to morph “innovations” into existing practice and treat the new evaluation laws as just one more compliance requirement. We also failed to understand the political and district costs of tying such laws to federal incentives, particularly given a strong ethos of local control in many school districts, like most of those in Colorado.
As a longtime educator and education advocate, I got caught up in the hubris. I helped construct and strongly supported the teacher-evaluation law but didn’t anticipate how the state education department and school districts would turn the law into practice. I figured it would be difficult to end up with something any worse than what was in practice in 2009.
I believe the intention was right, but it was wrong to force everyone in a state to have one “best” evaluation system.
Going forward will be a challenge. Most teachers’ unions have not supported these new evaluation laws and will look for any excuse to gut them and go back to the world where there were no objective measures of teacher effectiveness.
But we need to dig into what has happened—to understand what worked, what did not, and why. It’s not too late to acknowledge our mistakes and switch course. Instead of doubling down on ineffective policies, we must confront the quagmire and work toward a better solution. We should work back from the practice in our best schools and districts. Improving education requires that push forward, and it won’t happen overnight.
A version of this article appeared in the April 05, 2017 edition of Education Week as The New Teacher-Evaluation Laws: Education’s Pyrrhic Victory?