I’ve been preoccupied these past few weeks while the Gates Foundation issued its big, final Measures of Effective Teaching (MET) report. A lot has already been said summarizing and discussing the findings. No need for me to rehash all that. Instead, I thought I’d just take a moment to offer three thoughts on the whole thing.
First, as much fun as it is to pick on Gates, I found MET a pretty exemplary case of strategic foundation investment in schooling. (Full disclosure: Gates is, of course, a major funder of mine.) The foundation didn’t get bored or embrace a new fad midway, paid close attention to research, showed patience, and plunked down hundreds of millions in a creative and coordinated manner that would’ve been much tougher (and unaffordable) for IES or NSF. It prompted several heralded avatars of teacher observation to test their mettle by comparing their evaluations against one measure of student learning. We now know a ton more about value-added, student surveys, and teacher observations than we did five years ago. Moreover, the residue of this vast private investment is a trove of data, thousands of videos of classrooms available for analysis, and a body of basic research that nobody else was going to provide.
Second, the hundreds of millions spent on MET were funny in that, on the one hand, this was a giant exercise in trying to demonstrate that good teachers can be identified and that they make a difference for kids. I mean, I always kind of assumed this (otherwise, it’s not really clear why we’ve been worrying about “teacher quality”). So it was moderately amusing to see the MET researchers make a big deal out of the fact that random classroom assignment finally allowed them to say that high value-added teachers “cause” student test scores to increase. I’m like, “Not so surprising.” At the same time, this is too glib. Documenting the obvious can have an enormous impact on public understanding and policy, especially when questions are fraught. Moreover, I’ve long wondered about the stability of teacher value-added results over time, so I was struck that the third-year randomization showed earlier value-added scores to be more predictive than one might’ve thought.
Finally, as I’ve noted before, the whole fascination with MET and its would-be imitators brings to mind the problems with the collateralized debt obligations (CDOs) that helped drive the housing bubble and subsequent crash. In that case, a whole bunch of highly paid, brilliant quantitative analysts at a number of firms built complex models premised on the assumption that decades of housing data accurately and completely reflected what might happen in the future. They built in hedges, margins of safety, and all the rest, but all of it hinged on the old data. When those analyses led lenders and investors to behave in ways that were no longer congruent with the underlying data, a mess ensued.
Well, my biggest concern about MET is that the utility of value-added, student surveys, and teacher observations is judged solely by their efficacy in predicting value-added gains on reading and math tests. Now, the Gates folks are aware of this limitation, but the best they were able to do was show that their measures helped predict performance on reading and math tests other than the state assessments (specifically the SAT-9 and the Balanced Assessment in Mathematics).
Look, I have no problem with asserting that reading and math value-add is one measure of good teaching (and, quite frankly, I think it’s probably a big piece in most of the MET districts, and a much smaller piece in school systems where basic skills are less of a pressing concern). But I do think it’s a mistake to imagine that ability to move reading and math scores is universally a compelling proxy for being a “good” teacher. And when we calibrate all of our other instruments based on their ability to predict value-added gains on reading and math assessments, we build our entire edifice of teacher quality on what strikes me as a narrow and potentially rickety foundation. When we see policymakers mandate teacher evaluation systems that rely almost wholly on observation and value-added, and feel comfortable in doing so because of the MET findings, I fear we’re getting way ahead of ourselves.
MET has made an enormously valuable contribution. Even when the results are mundane, they’re useful. After all, the finding that nothing predicts value-added scores nearly as well as value-added scores shouldn’t unduly surprise, nor should the sparse evidence on the value of observational protocols (much like professional development, I think observation has long been more impressive in concept than in practice). But, more than anything, I hope that we resist the temptation to narrow our conception of good teaching to a handful of things we can conveniently measure, and instead make smart use of the MET findings while also seeking ways to more robustly gauge teacher performance.