Standards for Teacher Evaluation Mulled
With the pressure on to increase student learning, two states are in the process of overhauling what analysts say is among the most neglected pieces of the teacher-quality continuum: evaluation.
Both Georgia and Idaho are working to help districts institute performance-based teacher evaluations built on clear descriptions of effective teaching practices.
Neither state plans to require districts to use a specific evaluation instrument. Instead, officials expect the performance-based standards to improve consistency.
“There’s a desire among educators and policymakers to have consistent standards across the state,” Tom Luna, Idaho’s superintendent of schools, said in an interview. “If what a teacher is doing is determined to be good teaching in Boise, it should also be considered good teaching in Twin Falls.”
The process inevitably raises difficult questions. Policy experts, nevertheless, say the state momentum is overdue.
“It’s a logical and a belated step,” Thomas Toch, the co-director of the Washington-based research group Education Sector, said about the state activity. “It’s hard to believe that an industry that spends what we estimate to be $400 billion a year on teacher pay and benefits has such a flawed system of measuring the return on its investment.”
Mr. Toch, the co-author of a 2008 report on the state of teacher evaluations, contends that most evaluations are cursory and based on criteria that do not correlate with student achievement.
Teachers report similar stories, said a member of Idaho’s teacher-evaluation task force.
“I’ve been evaluated on a field trip to the farm. I’ve been evaluated when I was helping renovate the [playground] and was putting down new chip bark,” quipped Sherri Wood, the president of the Idaho Education Association and a 28-year teaching veteran.
A New Model
The lack of consistency within states appears at least in part to be due to a dearth of guidance.
In Georgia, teacher evaluation using the existing voluntary state model hinges primarily on each evaluator’s interpretation rather than clear standards, said Barbara Lunsford, the state manager for leader quality.
Aside from the arbitrariness, most evaluations are never used to help teachers improve their practice, according to Charlotte Danielson, a consultant on performance-based teacher evaluation. “Teacher evaluation is just something that everybody endures,” she said.
Performance-based evaluation frameworks, by contrast, are built on standards of teacher behavior that research links to improved student learning.
Each standard typically includes a description of the practice and examples of the evidence, such as lesson plans and student work, that evaluators are expected to seek in making judgments about teacher attainment of the standard.
Finally, they describe ascending levels of teacher performance based on the evidence collected.
Ms. Danielson’s model is one of the best-known and most widely adopted examples. It incorporates standards across four domains: instruction, classroom environment, professional responsibilities, and planning and preparation.
Evaluating teachers against such a framework has two benefits, Ms. Danielson said: Evaluators can give teachers specific feedback on their instructional successes and areas needing work, and the evaluations essentially become a form of professional development that helps teachers, over time, analyze their cognitive decisionmaking.
“It’s very rewarding to engage in conversations with a teacher about those decisions,” she said. “It becomes a kind of problem-solving exercise, not just a judgment.”
To scale up better evaluations, states need to take the lead role in setting and enforcing minimum teacher-performance standards, said Sandi Jacobs, the state-policy director for the National Council on Teacher Quality, a Washington-based group that promotes improvement in the profession.
Only 16 states set specific guidelines on evaluations, according to the council’s most recent tally.
“Multiple data points, a combination of observation and more-objective data—these things definitely matter,” Ms. Jacobs said. “And the idea that these kinds of policies are incompatible with local decisionmaking really isn’t so.”
Idaho is proof positive of that maxim. A longtime bastion of local control, the state until recently had little oversight of teacher evaluations aside from requiring districts to perform them once a year.
The notion of a uniform standard gained traction this year during legislative debate over proposed merit pay. That plan failed in the legislature, in part because teachers did not feel that districts had a consistent basis for evaluating effective teaching, said state schools Superintendent Luna.
The legislature then set up a task force to examine the issue and make recommendations by next year. The 22-member body has settled on Ms. Danielson’s framework as a foundation, Mr. Luna said.
Though its recommendations are still in the draft stage, the task force plans to require districts to align their evaluation instruments with the Danielson framework’s four domains. Beyond that, districts will be able to tailor the evaluation to their own needs, Mr. Luna said.
Georgia, unlike Idaho, has typically had stronger state control over education.
Considered innovative when first crafted in 2000, Georgia’s voluntary instrument was not aligned with the post-No Child Left Behind Act, standards-based classroom, said Ms. Lunsford, the manager of leader quality.
The new program, now being piloted in 180 schools, focuses on five “strands” similar to those in Ms. Danielson’s model. For each standard, teachers are scored on a four-tiered scale that represents growth over time, Ms. Lunsford said.
In one significant departure from the Danielson model, the Georgia program requires teachers to produce evidence of student-learning gains. It will leave it to districts and administrators to determine whether to base that evidence on test scores or other factors.
Both the Idaho standards and the new Georgia model require at least two full-period observations for new teachers, as well as pre- and post-evaluation conferences between teacher and evaluator. The systems, in effect, require a commitment by administrators to spend more time in classrooms, and hinge on training for the users of the new instruments.
Georgia’s new instrument comes with two days of training in which evaluators assess practices through videos and exercises.
The cost and provision of training in Idaho have not yet been fully settled, to the worry of the state teachers’ union.
“A more standardized state form won’t change anything unless there is training of teachers to understand the new form and the new way of being evaluated, and administrators are helped to use the new tool,” Ms. Wood said.
Several observers, such as Mr. Toch of Education Sector, have recommended that states train a corps of retired teachers, central-office personnel, and other officials to help principals with evaluations. The additional feedback would help increase the reliability of evaluations, he said.
Neither Georgia nor Idaho has set policy to that end, state officials said, although Idaho is toying with the idea of allowing teachers to be evaluated by administrators outside of their schools.
One long-standing tension lingers in the new performance-based evaluations’ purpose: They are tools primarily meant to establish paths for teacher improvement, but for the weakest teachers, they can also contribute to dismissal.
In Idaho’s draft policy, for instance, teachers whose performance falls short in two of the four domains—or whose performance falls short in the same domain on two successive evaluations—will be deemed unsatisfactory.
The state role in teacher dismissal as a whole is one area of concern to Ms. Danielson, who doesn’t want it to trump the focus on sustained improvement.
“I just hope states are not completely motivated by the bad-apple impulses of the legislators,” Ms. Danielson said. “I’m not terribly optimistic about that, frankly.”
In Georgia, Ms. Lunsford said, evaluators are sensitive to the fact that few teachers are likely to score at the “exemplary” level for every component.
At a training session last week, she recounted, one 20-year veteran noted that her own teaching practices would be considered “emerging”—the second-lowest level out of four—on several of the new system’s subcomponents.
“Our response was, ‘That’s fine—you’re not teaching the same way you were 20 years ago,’ ” Ms. Lunsford said. “A lot has changed.”
Vol. 28, Issue 06, Pages 18,21