Assessment Opinion

Public Displays of Teacher Effectiveness

By Rachael E. Gabriel & Jessica Lester — December 15, 2010 7 min read

Recently, a New York state court heard arguments around whether or not to publicly release value-added scores of 12,000 New York City teachers. The court hearing came only a few months after the controversial release of Los Angeles teachers’ scores. In late August of 2010, the Los Angeles Times began publishing a series of articles reporting on the quality of teachers and schools in the Los Angeles Unified School District, or LAUSD. In addition to the articles, the paper created a public, online database of individual teacher scores giving readers the power to hunt down “the ineffectives”—those teachers who apparently cause and sustain the achievement gap, many of whom, according to the Los Angeles Times, do so unknowingly. In this series, not only were we introduced to “the ineffectives,” we were also given one of many tools, value-added measurement, or VAM, to root such teachers out, and more easily identify “the miracle workers”—those teachers capable of leading their students to score higher than a statistical model predicted they would.

When we conducted an analysis of the discourse that surrounds this debate, what was most striking was not the fact that the scores were released, but the ways in which language was used to question and/or silence questions about the implications and outcomes of VAM. Though teacher effectiveness seems like a rallying cry the country can unite behind, the shape of the conversations about its measurement threatens to divide us.

See Also

For an opposing view on value-added measurement, see “Value-Added: It’s Not Perfect, But It Makes Sense,” (December 15, 2010).

When the Los Angeles Times published this series, the newspaper got what it wished for—a nationwide ripple effect—a discourse dispersed in talk and text that simplified and glorified the implications of a useful but not all-powerful tool. Throughout the series, the newspaper set up teachers as either one thing or another: effective or ineffective, good or bad, a detriment or a savior. With McCarthy-era tactics, the paper’s series flooded us with profiles of extreme-case formulations—examples so good, bad, or surprising that they almost seduced us into believing that “ineffectiveness” could be lurking anywhere, unbeknownst even to the teacher himself or herself, regardless of certification, reputation, or experience.

The Times forgot to share what those who study teacher effectiveness have been arguing for the last decade: Effectiveness is not a monolithic thing, but rather teachers are more or less effective across different subjects, students, and circumstances. So far, conversations about value-added measurement seem to use language in ways that present a single view of teaching and position teacher effectiveness as something static that can be estimated by a single statistic. Those who believe teacher effectiveness is flexible across subjects, students, and varying demands do not suggest that all teachers are good at something (some aren’t), but rather that the complexity of roles and expectations for teachers requires them to have a dynamic profile of effectiveness. Those who talk about VAM as if it were both the crystal ball and the Holy Grail for education reform would love for us to believe otherwise.

While it may seem that this debate is new news, in 2009, months before VAM was twinkling over the Los Angeles Times’ presses, several issues of Educational Researcher, the pre-eminent education research journal, were devoted to articles that outlined the complexity of identifying, let alone measuring, effectiveness in teaching. Six years earlier, the Journal of Educational and Behavioral Statistics published a special issue focusing exclusively on VAM. The overall conclusion of the editors was that VAM was valid only for school-level, not classroom-level, comparisons. Ironically, concerns around the reliability of value-added measurement are no longer central to the debate about publicly releasing individual teachers’ scores. Instead, its validity is most often called into question for the reason summed up by Charles G. Moerdler, a lawyer for the American Federation of Teachers, in a recent New York Times news article. “The information has no critical basis other than to facilitate a libel,” he said. “If it’s garbage in, it’s garbage out. Just because it’s a number, it doesn’t mean it’s suddenly objective.”

While the outcome of the New York court decision is pending, several New York news outlets, including The New York Times, have asked to publish the city’s teacher scores. But before New York news sources make the same mistake that their counterparts in Los Angeles made—making VAM seem like a litmus test capable of revealing who is and who isn’t an angel or criminal in the classroom—it may be useful to draw upon conversations about VAM that stretch back a bit further than this past August.

Ironically, the very researchers who popularized and cited the findings of earlier research aided by value-added analyses (e.g., Linda Darling-Hammond and Diane Ravitch) are now often quoted in opposition to some uses of VAM. Most education researchers are quoted as arguing for “multiple measures” of effectiveness, yet these measures are never described. The plea for multiple measures is therefore constructed as a fuzzy, unknown bundle of “other” things: a soft, teacher-defending, union-loving idea with no evidence, let alone a “real” name. Yet the names of those “other” things could be readily released: observational data; parent, student, and peer survey responses; portfolio reviews; and lesson analyses. It is also important to remember that two 2010 studies, one performed by researchers from Mathematica Policy Research and another by John P. Papay from Harvard University, showed that even when teachers were measured twice in the same year, approximately one-third of those categorized as “effective” on one measurement were categorized as “ineffective” the very next time, either because effectiveness is subject to dramatic change or because the measure itself is unstable or unreliable to begin with.

In an op-ed essay for the New York Post in October, Joel I. Klein, the outgoing chancellor of the New York City schools, wrote, “So what is value-added data and what can it tell us? It starts with the idea of fairness.” His response, “it starts with the idea of fairness,” makes the concept of VAM seem rational, perhaps even inherently useful. Yet, while Klein purportedly supports the use of VAM, the New York Daily News reported that he also acknowledges “that the rating system doesn’t tell the whole story about teacher performance.”

These conflicting perspectives are a construct similar to the logic of the Los Angeles Times: Value-added measurement is not perfect, but it’s the best we have. In the end, this not-so-perfect-but-it’s-the-best-we-have approach to measuring teacher effectiveness is positioned as rational, with the questions around the reliability and validity of VAM minimized.

The language deployed within this debate is not used to engage in substantive discussion about what is being measured and how. Instead, language is being used to sensationalize the topic, with extreme-case examples often used to counter alternative perspectives. For instance, the lawyer representing the United Federation of Teachers constructed the release of value-added scores as a life-or-death scenario, being quoted in the Los Angeles Times as saying: “The city of L.A. did this and a teacher jumped off a bridge. Do we want that?” This not only functions to associate a tragedy with the VAM score release, but also positions those who favor such measurement as supporters of something that threatens the lives of teachers, thus adding urgency to and further polarizing the debate.

Another tactic, which has been taken up by New York news sources, is linking discussions of and references to tenure—a traditionally divisive topic—to discussions of publicly releasing teacher scores. Though peripherally related to notions of accountability, and ways of measuring “effectiveness,” tenure seems to be made relevant almost as a diversionary tactic. This may work to remind people which side they should be for and which they should be against.

We argue that the use or release of value-added-measurement scores does not have to be an issue of tenure, seniority, or job security. Perhaps in New York, there is still a small window of opportunity for a more intelligent conversation—one that puts VAM into context for its readers; one that allows counterarguments to add caution and clarity, not hype and mudslinging, to an already divided and politicized education community. Though it is easy to say everyone is united around the idea of the need for effective teachers to be in every classroom, the exaggerated importance of a single statistic as a means for assessing teachers may divide us once again when it comes to measuring and encouraging the kinds of teaching all students deserve. The way we choose to write and talk about VAM may make all the difference.

A version of this article appeared in the January 12, 2011 edition of Education Week

