Two weeks ago, New York state’s highest court ruled that the New York City Department of Education could release for public scrutiny the value-added ratings of teachers of mathematics and English in grades 4-8. Rupert Murdoch’s New York Post, joined by other media, had filed a “freedom of information” request to obtain the testing data, and the United Federation of Teachers opposed their release, saying that the ratings contained many inaccuracies.
According to The New York Times, current schools Chancellor Dennis Walcott had “mixed feelings” about the naming of names, but his predecessor, Joel Klein, had “championed” their release. A story in the Columbia Journalism Review said that the city’s department of education had encouraged reporters to file “freedom of information” requests and responded with uncustomary speed when the requests were received.
The scores were released to the public last Friday.
Just for the record, it should be noted that on Oct. 1, 2008, then-president of the UFT Randi Weingarten signed a joint letter with then-Chancellor Klein that said: “We wish to be clear on one point: the Teacher Data Reports are not to be used for evaluation purposes.” On the same day, then-Deputy Chancellor Christopher Cerf (now acting commissioner of education in New Jersey) wrote a letter to Weingarten saying:
“1. It is DOE’s firm position and expectation that Teacher Data Reports will not and should not be disclosed or shared outside the school community, defined to include administrators, coaches, mentors, and other professional colleagues authorized by the teacher in question.
2. We will advise Principals to take steps accordingly.
3. In the event a [freedom of information] request for such documents is made, we will work with the UFT to craft the best legal arguments available to the effect that such documents fall within an exemption from disclosure.”
Clearly, the New York City Department of Education honored neither the letter nor the spirit of the promises it made in 2008.
While it has become customary to pay homage to the necessity of “multiple measures” when evaluating teachers, the New York City teacher data reports rely on only one measure: the scores of students on standardized tests in reading and mathematics, taken on one day each year for three consecutive years.
Most testing experts believe that value-added assessment has many technical problems that reduce its validity and reliability. The most recent research review appears in the current issue of the Phi Delta Kappan. Unfortunately, advocates of measuring teacher quality by student test scores never let research or evidence or, in New York City’s case, unequivocal commitments to privacy, get in their way.
Now comes the next act of this sordid drama. As soon as New York City made public the ratings of thousands of teachers, their names and scores were promptly published by the New York Post and posted online by The New York Times and other media outlets. According to the New York Post, the list included 12,170 names, but the Times says it was “roughly 18,000.”
Only one New York City media organization, gothamschools.org, had the integrity to refuse to name names. Journalists Elizabeth Green and Philissa Cramer explained: “We determined that the data were flawed, that the public might easily be misled by the ratings, and that no amount of context could justify attaching teachers’ names to the statistics. When the city released the reports, we decided, we would write about them, and maybe even release Excel files with names wiped out. But we would not enable our readers to generate lists of the city’s ‘best’ and ‘worst’ teachers or to search for individual teachers at all.”
City officials lamely warned that no one should “draw a conclusion based on this score alone,” but their plea predictably fell on deaf ears.
The New York Post exulted with a front-page, full-page banner headline: “REVEALED: TEACHER GRADES.” On day one, it printed a picture of and story about “the best teacher,” and on day two, a picture of and story about “the worst teacher.” The Post interviewed parents who said they wanted their child out of that teacher’s class or they wanted her fired. In recent years, the Post has often run stories about teachers who allegedly are criminals, perverts, or just plain lazy, greedy dummies who can’t be trusted to teach anything and shouldn’t be allowed near children. It seems that the Murdoch journal won’t be satisfied until every school has been turned over to private management, with no unions, no seniority, and no job protections whatever for teachers.
In its coverage, the august New York Times sent a mixed message. Its first headline said “Teacher Quality Widely Diffused, Ratings Indicate,” which implied that the ratings actually measured teacher quality and meant something real and important. And indeed, the first three paragraphs stated that every corner of the city, from the poorest to the most affluent districts, had teachers who were the most and least successful.
But the fourth paragraph of the Times story revealed the statistical inadequacy of the measures: “... the margin of error is so wide that the average confidence interval around each rating spanned 35 percentiles in math and 53 in English, the city said. Some teachers were judged on as few as 10 students.” With such a large margin of error, it’s hard to know how anyone could take these ratings seriously. The precise numbers attached to each teacher’s name are nothing more than junk science.
The ratings were based on state tests from 2007-2010. In the delicate words of the Times, the state tests from 2007-2009 had been “somewhat discredited.” The state education department acknowledged in 2010 that the scores for several years prior to that date were unreliable and inflated. Someone in the state education department had lowered the passing mark in some grades year after year to create the illusion of progress. That someone was never identified and never held accountable for having misled students, teachers, and parents. The state recalibrated scores across the state in 2010, which caused scores to fall in every district and, coincidentally, put an end to the New York City “miracle.” But those “somewhat discredited” scores for 2007-2009 are now accepted as the foundation of the city’s value-added ratings.
In another story on the same day, the Times showed just how arbitrary the scores are. It reported that quite a few teachers in the city’s most coveted public schools had low scores. Their scores were low not because they were bad teachers, but because of the city’s methodology, which graded teachers on a curve and compared teachers with others with similar students. Suddenly, even the best schools started looking like undesirable schools.
As educators spoke up, the ratings became even more problematic. One teacher said to the Times, “This data is based on ONE test taken on ONE day. ...Yes, I administered this test that generated this data to my 6th graders two years ago. I no longer teach 6th grade, and I no longer teach in the same school, or even the same subject. How is this data relevant today?”
A junior high school teacher told the Times that he had received only two rankings, 88 percent in one year, and 38 percent in the next, yet his rating was averaged as 40 percent. He noted that his score included students who had transferred in or out mid-year as well as students who were truants and rarely attended class at all.
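For comparison, a plain average of the two rankings this teacher reported can be checked in a few lines. This is only a sketch: it assumes each year counts equally, which may not match the city's formula, and the account above does not say how the 40 percent was computed.

```python
# Rankings the teacher reported for two consecutive years (from the account above).
yearly_rankings = [88, 38]

# Assumption: each year carries equal weight. The city's actual weighting
# is not described in the account, so this is a comparison point only.
simple_average = sum(yearly_rankings) / len(yearly_rankings)

print(simple_average)  # 63.0
```

Under that equal-weight assumption the result is 63, well above the 40 percent he was assigned.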
The principal of an outstanding elementary school in Brooklyn wrote me to say that the release of the ratings made her “absolutely sick.” One of her teachers was rated for a year when she was away on child-care leave. A teacher of gifted children got a very low rating because her students’ scores went from 3.97 to 3.92. That change of five one-hundredths of a point caused the teacher to rank in the sixth percentile citywide. The principal said this teacher is one of her best, yet she has been publicly labeled one of the city’s worst teachers.
The principal said that the public ratings are very demoralizing to all her teachers, because they are so arbitrary. Even the best teachers wonder if their heads will be on the chopping block next year.
The day before the scores were released, Bill Gates published an opinion piece in the Times opposing their release. He said that teacher evaluations should not be made public because doing so makes it impossible for supervisors and employees to have an honest conversation about how to improve. I only wish he had published his views weeks or months ago, or in 2010, when The Los Angeles Times initiated this practice. Perhaps a phone call by Bill Gates to New York City Mayor Michael Bloomberg would have made a difference, since the mayor doesn’t listen to parents or teachers.
Gates raises an important question: What is the point of evaluations? Shaming employees or helping them improve? In New York City, as in Los Angeles in 2010, it’s hard to imagine that the publication of the ratings—with all their inaccuracies and errors—will result in anything other than embarrassing and humiliating teachers. No one will be a better teacher because of these actions. Some will leave this disrespected profession—which is daily losing the trappings of professionalism, the autonomy requisite to be considered a profession. Some will think twice about becoming a teacher. And children will lose the good teachers, the confident teachers, the energetic and creative teachers, they need.
Impelled by Race to the Top and Secretary of Education Arne Duncan’s No Child Left Behind waivers, teacher value-added ratings are rapidly spreading to other districts and states. And in these many other districts and states, the media will file requests for release of these ratings. When it first happened in Los Angeles in 2010, Arne Duncan said this was a wonderful thing, that “Silence is not an option.” He also said, “What’s there to hide?” And he said that parents have a right to know if their teacher is effective. Wherever there are value-added ratings, you can be sure that there will be public disclosure of those ratings to the media.
Interesting that teaching is the only profession where job ratings, no matter how inaccurate, are published in the news media. Will we soon see similar evaluations of police officers and firefighters, legislators and reporters? Interesting, too, that no other nation does this to its teachers. Of course, when teachers are graded on a curve, 50 percent will be in the bottom half, and 25 percent in the bottom quartile.
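The arithmetic of grading on a curve can be made concrete with a minimal sketch. The scores below are invented for illustration, and `percentile_ranks` and `share_below` are hypothetical helpers, not the city's methodology. The point is that when teachers are ranked only against one another, the same share lands in the bottom quartile even if every teacher's score rises.

```python
def percentile_ranks(scores):
    """Rank each score against the others: percent of scores strictly below it."""
    n = len(scores)
    return [100.0 * sum(other < s for other in scores) / n for s in scores]


def share_below(ranks, cutoff):
    """Fraction of percentile ranks that fall below the given cutoff."""
    return sum(r < cutoff for r in ranks) / len(ranks)


# Hypothetical scores for 20 teachers (illustrative numbers only).
scores = [50 + 2 * i for i in range(20)]
# The same teachers after every one of them improves by 30 points.
improved = [s + 30 for s in scores]

for label, data in (("original", scores), ("improved", improved)):
    ranks = percentile_ranks(data)
    # Share in the bottom quartile, then share in the bottom half.
    print(label, share_below(ranks, 25), share_below(ranks, 50))
```

Both lines print 0.25 and 0.5: a uniform improvement changes no one's rank, so a quarter of the teachers are still labeled bottom-quartile.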
Is this just another ploy to undermine public confidence in public education?
If ever we get past this terrible time of teacher-bashing and blame-shifting, Arne Duncan and his ignominious Race to the Top will have a lot to answer for. And so will the irresponsible leadership of the New York City public schools, which cares so little for the morale and spirit of those whom it presumably leads.
The opinions expressed in Bridging Differences are strictly those of the author(s) and do not reflect the opinions or endorsement of Editorial Projects in Education, or any of its publications.