Last week’s teacher effects brouhaha brings me back to where this blog started - not eduwonk channeling Britney, but rather how to measure teacher effectiveness. We know a lot more about estimating teacher effects on student test scores than we did 10 years ago. (Readers know well that I am just as concerned with the academic and social outcomes of education that test scores don’t capture, but that is for another post.) Nonetheless, big-picture questions linger, and Mary Lou Retton-worthy technical gymnastics won’t make teachers comfortable with value-added until these questions are answered. Here’s what I’d like to know before moving forward:
1) How do schools affect teachers’ ability to be effective in the classroom? The current assumption about teacher effects is that they reside within the teacher - i.e., that a teacher is “good” or “bad” independent of the school context in which s/he works. But we don’t know whether a teacher is equally effective across different schools, or whether some component of a teacher’s effectiveness is “firm-specific.” For example, Harvard health economist Robert Huckman has examined doctors’ effectiveness across hospitals and found that their human capital isn’t entirely portable (The Effect of Organizational Context on Individual Performance). Is this also true in education?
2) How, and how much, do colleagues matter? Having higher-quality colleagues may make you a better teacher yourself. We need to know whether “teacher peer effects” exist and, if so, how important they are. (For more, see No Teacher is An Island.) Colleagues matter in a second way in middle and high school, where kids have different teachers for different subjects. If your students have an exceptional English teacher, it’s easier for them to write lab reports in your science class, and prior-year teachers may matter as well. We need to know how these crossover effects operate, and how large they are.
3) Are the same teachers that are effective in promoting short-term score gains effective in promoting longer-term academic growth? We currently estimate teacher effects on what happens on a year-end test, but what we’re really after is teachers’ long-term effects on their students. We’re not interested in short-term score inflation, but in improved learning that lasts. (See this New Yorker article about the trouble with hedge fund bonuses.) A new paper by Spyros Konstantopoulos, “How Long Do Teacher Effects Persist?”, provides some insight here.
4) Are the same teachers that are good at promoting math skills good at promoting reading skills? Does being an “effective teacher” mean being good at one, or good at both? Current estimates of the correlation between teachers’ math and reading effects are in the neighborhood of .50-.60.
5) How large are student peer effects, and how does their existence complicate our ability to estimate teacher effects? Classrooms are interactive organisms, not individuals sitting in separate cells. Teachers are well aware of this fact, and talk about classes from hell/heaven. Peer effects can be random - e.g., a couple of kids who chemically react and pull the class down with them - or socially patterned. For example, in classes with a higher proportion of girls, both girls and boys perform better (see More Girls=More Learning). How should our knowledge of peer effects in the classroom affect the way we model teacher effects?
6) What about non-random assignment? Non-random assignment may be the biggest threat to value-added systems. (See The Great Sorting Machine for more.) It’s important from a technical perspective (see Do Value-Added Estimates Add Value?), but also from a legitimacy perspective. Teachers know that principals can bury them by sticking them with tough kids; the toy simulation after this list shows how classroom sorting alone can make two identical teachers look very different.
7) Are all gains created equal? Should gains for high performers be treated differently than gains for low performers? In other words, should a gain of 10 scale score points for a high-scoring kid count the same as a gain of 10 points for a low-scoring kid?
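To make questions 5 and 6 concrete, here is a minimal, hypothetical simulation (plain Python with numpy; the teacher labels, parameter values, and effect sizes are all invented for illustration and don’t come from any real value-added system). Two teachers with identical true effects end up with very different naive gain-score “value-added” once a principal sorts stronger students into one classroom and classmates influence one another’s growth.

```python
# Toy sketch, assuming a simple additive growth model: posttest = ability
# + true teacher effect + peer effect + noise. All numbers are made up.
import numpy as np

rng = np.random.default_rng(0)

n_per_class = 30
true_teacher_effect = {"A": 0.0, "B": 0.0}  # both teachers are equally effective
peer_coef = 0.3                             # hypothetical boost from classmates' mean ability

# Students with differing latent ability; the principal non-randomly sorts
# the strongest 30 into teacher A's class and the weakest 30 into teacher B's.
ability = np.sort(rng.normal(0, 1, 2 * n_per_class))
classes = {"B": ability[:n_per_class], "A": ability[n_per_class:]}

naive_value_added = {}
for teacher, kids in classes.items():
    pretest = kids + rng.normal(0, 0.5, n_per_class)
    # Growth = true teacher effect + peer effect (classroom composition) + noise
    peer_effect = peer_coef * kids.mean()
    posttest = (kids + true_teacher_effect[teacher] + peer_effect
                + rng.normal(0, 0.5, n_per_class))
    # Naive "value-added": the classroom's average gain score
    naive_value_added[teacher] = round(float((posttest - pretest).mean()), 2)

print(naive_value_added)
# Teacher A's apparent "value-added" beats teacher B's even though their true
# effects are identical -- the gap reflects who walked in the door, not teaching.
```

The point isn’t the particular numbers; it’s that a model which ignores sorting and peer composition will happily attribute classroom composition to the teacher.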
Why do these big-picture questions matter? Each has modeling implications. More importantly, teachers have these concerns about value-added estimates, and they deserve to have their questions answered. From following the use of value-added in Dallas at the Dallas ISD Blog, it appears that few teachers actually understand how their CEI (Classroom Effectiveness Index) scores are calculated. Researchers and wonks interested in trying value-added need to do a better job of explaining these systems to teachers, of making them comprehensible, and of addressing concerns like those raised above.
My one line position on value-added? It’s not ready.