I am sure you were as surprised as I was to read the newspaper headlines saying that Randi Weingarten proposed that teachers should be evaluated by their students’ test scores. This is a contentious issue. In New York, at Randi’s urging, the state legislature passed a law preventing districts from doing exactly this. Now, to qualify for the so-called Race to the Top competition, the state must roll back this legislation.
We know the downside of evaluating teachers by student scores. It is neither a fair nor an accurate way to judge teachers, and it produces unintended negative consequences. It compels teachers to teach to the test. This in turn narrows the curriculum to what is tested. As Secretary of Education Arne Duncan has acknowledged, the current tests should be replaced by better tests. Why, then, use them for high-stakes decisions? If there is one principle on which all testing companies agree, it is that tests should be used only for the purposes for which they are intended. A test of 4th grade reading tests the reading ability of a student in 4th grade, not the ability of the teacher. There is a plenitude of research demonstrating that value-added assessment is not ready for prime time. Those who defend it should look at the NAEP scores of Tennessee, where value-added assessment has been used for many years. Tennessee posts remarkably high scores on its own state tests but has made little, if any, improvement on NAEP. No value-added improvement there, despite years of implementation.
For evidence of curriculum narrowing, go to the New York City Parent Blog, where Steve Koss writes about the sharp decline in the number of New York City public school students who qualified as semifinalists in the Intel Science Talent Search. As of 2002, before the double whammy of NCLB and mayoral control, New York City was averaging 46 Intel semifinalists a year. But last week, when the winners were announced, the city’s public schools had only 15 semifinalists! In our new age of data-driven instruction, science doesn’t matter anymore, just reading and math, because they are tested and science is not.
As it happens, Randi did not propose what was widely reported in the press. When I went to the AFT Web site and read her remarks, I discovered that she actually made a very thoughtful proposal. It will not satisfy the simplistic demand to fire teachers if their students’ scores don’t go up. But she made some very good points, and I wonder how many districts will be willing to take her advice; it won’t be easy or inexpensive.
Before they can fairly evaluate teachers, she said, states must adopt professional standards that spell out “what teachers should know and be able to do,” so teachers know what they will be held accountable for. Then, the evaluation itself should use “multiple means,” including “classroom observations, self-evaluations, portfolio reviews, appraisal of lesson plans, and all the other tools we use to measure student learning—written work, performances, presentations, and projects...” Then she says, “Student test scores based on valid and reliable assessments should ALSO be considered—NOT by comparing the scores of last year’s students with the scores of this year’s students, but by assessing whether a teacher’s students show real growth while in his classroom.”
She adds that principals and superintendents should be held accountable for successful implementation of the evaluation plan. She also insisted that the purpose of evaluation is to help teachers, not just to judge them, so every district should have a program to “support and nurture teacher growth,” including mentoring, ongoing professional development, and career opportunities that “keep great teachers in the classroom.”
Randi said that the AFT would work with any district that was prepared to commit itself to implementing an evaluation system with these components and a due process system aligned to it. To say, as the headlines did, that Randi supports evaluating teachers by student scores is misleading.
How many districts now test students at the beginning and end of each school year, as Randi proposes? My guess is that most schools follow the NCLB template, testing each grade once a year and comparing this year’s students with last year’s. This leads to comparisons of unlike groups and a lot of useless data. Randi’s recommendation might lead to more testing, but at least it would show each student’s progress over the course of a year in the same classroom.
What do you think?
The opinions expressed in Bridging Differences are strictly those of the author(s) and do not reflect the opinions or endorsement of Editorial Projects in Education, or any of its publications.