The rapidly changing debate on how to account for student achievement in teacher evaluations is putting teacher-district relationships to the test across the country.
More than half the states now require districts to take student achievement into account when evaluating a teacher’s performance. In most cases, that’s calculated through a so-called value-added model that attempts to account for a teacher’s role in a student’s growth over the course of a year, via test scores and other performance measures. Most of the push for value-added evaluation systems has come only in the past few years, driven in part by the federal Race to the Top grants, which gave extra weight to states that included achievement-based teacher evaluations in their applications, and by the federal Teacher Incentive Fund grants, which so far have supported experiments in 175 districts in 33 states using student achievement in teacher performance-pay plans.
“Some of the fear and worries we’re seeing now, not just from unions but across the board, are because we’ve seen an enormous amount of change and uncertainty in a short amount of time,” said Christopher A. Thorn, the associate director of the University of Wisconsin’s Value-Added Research Center and its director for data quality and systems innovation. Moreover, he said, “Budget pressures have actually stressed districts and pushed people to do these things. Forget about paying teachers. That’s a minor part of the reforms we see in districts; they’re changing who they hire, they’re changing their leadership systems, they’re changing professional development. They are changing everything in the system.”
Value-added, or growth, models track individual students’ test scores from year to year, which advocates say can help isolate the effect of the instruction that students receive during one school year from their academic backgrounds and prior education experiences. Critics—including many teachers’ unions—argue that annual test scores do not give a full picture of student growth, and that the statistical models used for the measures are not designed to evaluate teachers.
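As a rough illustration of the year-to-year tracking described above, here is a minimal sketch of a gain-score comparison. It is not the model any district named in this article uses: real value-added models typically rely on statistical regression and adjust for student characteristics. The function names and the sample scores are hypothetical.

```python
from statistics import mean


def mean_gain(scores):
    """Average year-over-year gain for (prior_year, current_year) score pairs."""
    return mean(current - prior for prior, current in scores)


def simple_value_added(classroom_scores, district_scores):
    """Toy estimate: a classroom's average gain minus the district's average gain.

    Assumption: both lists hold (prior_year, current_year) pairs on the same
    test scale. No student characteristics are controlled for here.
    """
    return mean_gain(classroom_scores) - mean_gain(district_scores)


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    classroom = [(50, 60), (70, 75)]
    district = [(50, 55), (60, 65), (70, 75), (80, 85)]
    print(simple_value_added(classroom, district))
```

A positive result means the classroom's students gained more, on average, than students districtwide. Critics' concerns about such measures apply with even more force to a sketch this simple.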
Mr. Thorn’s research center, which the U.S. Department of Education contracted to provide technical assistance on the Teacher Incentive Fund grants, is now evaluating how the 175 participating districts have implemented value-added systems, including how the implementation has affected the working relationships between teachers and administrators. The different models vary widely in the subjects covered and the types of student characteristics taken into account, and researchers and teachers’ unions alike have voiced concern that the systems do not give an accurate picture of teacher performance, particularly for teachers outside mathematics and reading, the subjects tested under federal accountability requirements.
In districts from California to Florida, experts say value-added evaluation systems are proving a catalyst to bring out the true strength—or weakness—of districts’ relationships with their teachers.
“There have been a lot of game changers, particularly in the last two to three years,” Mr. Thorn said. “Race to the Top has changed the game for many winning states; there are places where people were collaborating, and RTT has thrown a wrench in the works because it changed the rules of the game—or, to the contrary, places where it has been helping collaboration, in states where they had never talked about teacher effectiveness at all.”
The combination of political and budget pressures may also give all sides more incentive to work together than ever before. “Public schools are in a huge state of transformation across the nation. If a place is only accustomed to really contentious old-style negotiation, you’re going to have a lot of really important issues that are going to be forced on you by state legislatures or Congress if you don’t figure it out for yourselves,” said Jean Clements, the president of the Hillsborough County teachers’ union in Tampa, Fla. “This is not a place where you hold each other hostage and say I’m not going to talk to you about the new evaluation system until you do X, Y, and Z.”
“As somebody practicing in public schools, I absolutely want to have a say in what that change looks like,” she added. “I don’t want to abdicate that ability to people outside our district because we don’t know how to sit down and talk to each other.”
Even when districts and teachers initially commit to working together, the highly public debates and scrutiny that have surrounded value-added teacher-evaluation systems can strain relations, as the 670,000-student Los Angeles district has learned.
Through case studies, the Washington-based National Commission on Teaching and America’s Future has distilled several ways that teachers’ unions and district administrators can lay a foundation for a close collaborative relationship before controversial issues like teacher-evaluation systems come up.
• Systemic reform must include active, formal and informal cooperation between district administrators and teacher representatives.
• All stakeholders must agree on a comprehensive vision focused on student learning and improving instruction. Any improvement proposal, implementation or budget plan, professional development, or monitoring and assessment system should be viewed through this lens.
• Key stakeholders should set aside a dedicated time and space to meet to work out details, particularly when establishing the collaboration.
• All sides should agree to shift the focus of negotiations to a shared goal of student achievement.
• All sides should actively develop, nurture, and protect trust and honesty in the collaboration.
• All stakeholders should be kept informed, and should inform their constituencies, about changing goals and progress. This also means ensuring people understand how changes will work and their ramifications.
• Leaders should reach out to outside foundations, funders, and others as a team and present a common agenda.
Read the full report, “Reducing the Achievement Gap Through District/Union Collaboration.”
Los Angeles initially started developing a value-added teacher-evaluation system in response to a 2009 series of articles in the Los Angeles Times that highlighted the difficulty of firing teachers, even those who committed acts that could be subject to criminal prosecution.
The school board launched a task force intended to evaluate and overhaul the teacher-evaluation system. Both teachers and administrators found the existing system wasn’t fair or valid, according to Drew Furedi, the executive director of the district’s office of talent management.
“It was the Noah’s Ark of task forces: We had two of everyone,” Mr. Furedi recalled, from teachers and principals to union and central-office staff, parents, students, and community leaders. The group of about 35 met for six months, setting up criteria for a new, comprehensive evaluation system.
“When you get groups like that together, it takes a while to build trust; you tend to make statements rather than have discussions,” Mr. Furedi said. “It took months of conversations, but in the subcommittees, people started to dig into those conversations more.”
By summer 2010, the task force had developed a plan to evaluate teachers using four measures: value-added student-assessment data, plus separate observations by principals, teacher leaders, and peer teachers. The group was still discussing what student-outcome data to include in the value-added measure in August 2010, when the Times broke a second series of reports on teacher performance, this time using a separate value-added calculation to rank teachers’ effectiveness publicly, by name.
“That became very much the discussion of the moment, whether people were defining teacher effectiveness using value-added,” Mr. Furedi said. “Lost in the hubbub around that issue was that our task force had months ago said this should be part of the effort to understand not just teacher effectiveness but evaluation, support, and the development process.”
In the fallout, the initial task force has not continued to meet, and the 670,000-student district has struggled to build consensus with its teachers on an evaluation system. Although 1,000 teachers volunteered for a district pilot program to test an evaluation system based on its recommendations, the union has filed a complaint with the California labor-relations board to stop the pilot.
“There are some in the union-elected leadership who’ve said their concern and fear is that this is just a path to get to test scores being as big a part of this [evaluation] as possible, and we’ve tried to say continuously that that is not what we want to do at all,” Mr. Furedi said.
The highly public, and national, debate prompted by the Times’ value-added series “definitely created some challenges—how do you then move from there to developing a partnership with all stakeholders?” Mr. Furedi said.
Houston’s school district has had a steep learning curve on the importance of collaboration in crafting evaluation measures for its merit-pay program, according to John C. Hussey, the chief strategy officer of the Columbus, Ohio-based Battelle for Kids, which provides technical support for developing teacher-evaluation systems, including Houston’s. “Although they had a cordial relationship with the union before this started, there was not collaboration. The unions were not engaged at a strategic level; they were consulted, informed, communicated to.”
Texas does not give collective bargaining rights to public employees like teachers, and Gayle Fallon, the president of the Houston Federation of Teachers, said historically teachers have lobbied for changes to state laws, which she said is often less contentious than going through the local school board. In many cases, state law includes provisions on arbitration and even planning periods as detailed as normal contract language.
That approach backfired in 2005, when Houston first experimented with value-added systems for the bonus-pay program. “Houston did just about everything wrong that you could do wrong initially,” Mr. Hussey said. “That first year when they tried to do everything themselves, it was a disaster.”
The 204,000-student district devised the value-added calculation for its merit-pay program by itself, without major teacher or outside input. Ms. Fallon recalled that many teachers found out about their bonuses not from the district, but from a story about them in the Houston Chronicle. The district held only two voluntary open meetings to answer teachers’ questions about the system before it was launched. It then miscalculated about 100 of the merit-pay awards and, a week later, had to ask some teachers to return their checks. Some of them did so, Ms. Fallon said; others, who had already spent the money, simply quit.
Julie Baker, Houston’s chief officer of major projects, said the school board was surprised at the vehemence of the backlash. The first school board meeting after the merit-pay plan started prompted concerns from the fire marshal because the meeting room was “packed to overflow with teachers who were there to complain about the system,” she said.
Districts that involve teachers in the development of value-added systems tend to air and address such concerns early in the process and establish a plan to account for any instability in the first years of implementing the model, Mr. Hussey said. “When you have union or teacher involvement, you tend to have more measures involved, and they are more cautious in the way they use value-added for accountability purposes,” he said.
In response, Houston has tried to take a different tack to develop its teacher-evaluation system. It added outside researchers to set up a new value-added system and hired Battelle for Kids to help bring teachers back into the discussion. More than 2,600 teachers participated in the design discussions, through surveys, focus groups, and committees. It also recorded classroom observations of the teachers rated most effective, for use in staff training. “Once you can put a face to who these measures are intended to identify, you start to get more buy-in,” Mr. Hussey said. “Misconceptions that the teachers who get value added are these teachers who drill and kill get washed away with the videos that show that these are really good teachers.”
Teachers still participate in a biweekly student-performance working group and a new one dedicated to implementing the evaluation system, though Ms. Fallon said the union has filed a grievance with the state over how the advisory committees were formed. The district will roll out the observation protocols for the evaluation system this year, with value-added calculations added next year, due to a delay in state assessment results required for the measure, Ms. Baker said. In the meantime, the district is ramping up training for teachers on how the new system works and plans regular checks on how implementation progresses in the next year.
“What we don’t want to have happen is next spring to have a poll and find out people think the new system is no better than we had before,” Ms. Baker said. “You have to have a comprehensive engagement, not just communication but real engagement with all your stakeholders.”
Collaboration has improved somewhat in the district in the wake of the value-added debate over merit pay, Ms. Baker said. The May meeting at which the school board approved the new evaluation system was much less contentious than the previous merit-pay meeting, she said; three teachers’ union representatives spoke against the move, but eight other teachers spoke in favor of the plan.
Yet Ms. Fallon of the teachers’ union said the relationship between teachers and district administrators has “gotten a whole lot more hostile,” and she said the district still has not built a strong foundation for partnering with teachers in the future. “The fact that there probably isn’t a teacher in Houston who understands the value-added [system] and how they calculate it doesn’t matter to them,” Ms. Fallon said of the district. “A lot of teachers consider it like winning the lottery: ‘I don’t know what I did to deserve this, but I’ll enjoy spending it.’
“It’s going to take a lot to rebuild trust, not just in the union but in the workforce, because we have a very demoralized workforce right now,” she said.
It’s too early to tell whether Houston will be able to build strong enough support to sustain the evaluation system in the long run, Mr. Hussey said. In the years since Houston first experimented with value-added merit pay, for example, it has had two superintendents and more than 50 new principals, he noted. “As leadership changes, to sustain these types of efforts, you need to institutionalize them, and the only way to do that is in a collaborative way,” he said.
“I think if you have a collaborative environment, it makes the implementation of value-added for accountability purposes much easier,” Mr. Hussey said. “It’s a much more difficult situation if you don’t have that collaboration at the beginning.”
The Hillsborough County district, now in the first year of rolling out its Empowering Effective Teachers evaluation system, has that kind of history.
Amanda W. Newman, a 2nd grade teacher at Valrico Elementary School, said the district has always involved teachers from every grade level in decisions on everything from the reading curriculum to basic school calendar planning.
“In a lot of districts, I know these would be sort of top-down decisions,” Ms. Newman said. “But the bottom line is all those decisions affect us. Just having that ability to have a voice in it has always been something we’ve had.”
District officials involve teacher representatives, including union members, on all decisionmaking committees and hold monthly meetings with teachers from each of the district’s 243 schools.
“It’s not just about bringing [teachers] to the table upfront, but keeping them at the table and respecting their expertise, trying to engage them in all the work that we are doing,” said Tracye Brown, the director of communications and project management for the district’s Empowering Effective Teachers initiative. “I think that’s why it happened so naturally for us, because it wasn’t based just on contract negotiations, but around making sure our teachers were supported throughout.”
Even in districts with a history of collaboration, value-added discussions can raise concern, experts say, because they are based on complicated statistical models that teachers generally can’t calculate on their own.
“Inevitably, even in a district that does an A-plus job and sticks the landing, there will still be a handful of teachers who don’t like [value-added],” said Charmaine Mercer, the policy and research director for the Los Angeles-based Communities for Teaching Excellence, a nonprofit that works with districts on value-added evaluation systems, “and they may not be bad teachers; they may be the teachers who were always rated ‘satisfactory’ under the old system, and now with the new system and greater granularity, they are simply average.”
Ms. Clements, Hillsborough’s union president, said the discussion over value-added has increased both the collaboration in the 195,000-student district and the need for it. The district has spent the last 2½ years developing its evaluation system, with massive teacher participation. After an initial 8,000-teacher survey and focus groups to identify concerns, it drew up an advisory group of 30 teachers, including representatives from each grade span and content area, as well as special areas such as career and technical education and instructors for students with disabilities.
“What was really cool was that in these meetings it wasn’t just a bunch of teachers sitting around having pie-in-the-sky discussions; we had top district officials in there taking furious notes and asking for clarification on criteria we were recommending,” Ms. Newman said. “They didn’t want people who would just sit there and nod and agree with them.”
Teachers received their first effectiveness score this fall, 60 percent of which is based on principal and peer observations, including a 60-minute formal class observation and several 20- to 30-minute surprise visits, based on a teacher’s prior evaluations. The value-added component—which represents 40 percent of a teacher’s score—measures student growth during the past three years on several tests, including the state accountability test, Stanford achievement tests, and end-of-course exams.
The multiple assessments are intended to help the district evaluate teachers who work in subjects and grades outside math and reading.
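The 60/40 split described above can be sketched as a simple weighted sum. This is only an illustration under stated assumptions: the article gives the two percentages but not the scoring scale or the district's actual formula, so the 0-to-100 scale, the function name, and the sample inputs below are hypothetical.

```python
OBSERVATION_WEIGHT = 0.60  # principal and peer observations (from the article)
VALUE_ADDED_WEIGHT = 0.40  # value-added student-growth component (from the article)


def composite_score(observation, value_added):
    """Combine the two components into one effectiveness score.

    Assumption: both inputs are already on a common 0-100 scale. How the
    district actually scales and combines the components is not stated.
    """
    for name, value in (("observation", observation), ("value_added", value_added)):
        if not 0.0 <= value <= 100.0:
            raise ValueError(f"{name} must be between 0 and 100, got {value}")
    return OBSERVATION_WEIGHT * observation + VALUE_ADDED_WEIGHT * value_added


if __name__ == "__main__":
    # Hypothetical component scores, for illustration only.
    print(round(composite_score(80.0, 70.0), 2))
```

Under these assumptions, a 10-point change in the observation component moves the composite by 6 points, while the same change in the value-added component moves it by 4.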
“No matter how stressful teachers have gotten as we tried to wrap our heads around this very different way of being evaluated, the best thing is teachers are now talking with other teachers about good teaching,” Ms. Newman said. “They are talking about that rubric and how to demonstrate proficiency. That’s been something that’s really exciting to see.”
Coverage of policy efforts to improve the teaching profession is supported by a grant from the Joyce Foundation, at www.joycefdn.org/Programs/Education.
A version of this article appeared in the November 16, 2011 edition of Education Week as Evaluation Reforms Put Partnerships To the Test