Over at the Ed Sector, there’s some confusion about my concern with the ethics of the NYC teacher experiment (see here). To be clear, my problem is not that NYC is collecting value-added data. As I have written before, standardized tests have a role to play in teacher assessment alongside holistic evaluation of teachers’ effectiveness. But as eduwonk himself noted, the methodological issues are hairy and as of yet unresolved.
The concern expressed in my earlier post was that this experiment was conducted in secret and, in my opinion, in violation of generally accepted human subjects policies. The entire enterprise of social science relies on potential study participants trusting researchers to minimize risks and fully disclose the purpose of their study. Every time a gaffe like this happens, it undermines researchers’ ability to build trust with study participants in the future. Let’s review the chronology:
1) In September, an academic experiment headed by two very talented researchers, Jonah Rockoff (Columbia Business School) and Tom Kane (Harvard Grad School of Ed), was announced. It was presented as an experiment intended to generate academic knowledge, not to inform human resources decisions in real time. (You can watch a video of a study recruitment session here.)
2) Academic research is bound not only by common-sense research ethics, but by the conventions of university Institutional Review Boards. What this means is that when academic researchers conduct research intended to produce generalizable knowledge - i.e. if researchers want to publish off of these data - the experiment has to proceed within generally accepted research ethics and a university IRB has to approve it. (Even if this were not an academic research project, the DOE should have notified teachers of an intervention of potential consequence for them. After all, the data are not just being collected, but distributed to principals in the experiment’s treatment group.)
IRBs are primarily concerned with the harm that researchers could do to subjects by intervening in their lives, and applicants to IRBs must demonstrate that their project poses minimal risks, that participants have been notified of these risks, and that participants have consented to the research. Teachers did not need to consent in this case, as they are government employees and their employers can collect whatever data they want.
However, it is difficult for me to understand how one could justify not notifying teachers in the study. After all, the information given to their principal - which, given the ongoing methodological problems with value-added, may or may not be accurate - has the potential to permanently change their principals’ perceptions of them and their future employment prospects. Moreover, this treatment is not being applied universally to NYC teachers. By simply having the bad luck to be selected into the study’s treatment condition, some teachers are affected and others are not.
It is important to note that a “live experimental” study like this one is different from the secondary data analysis studies that eduwonk cites. He wrote:
By that logic, all these various studies with panel data, choice studies using lotteries, etc...all constitute human experimentation and are wrong.
Studies based on secondary data analysis are fundamentally different - and are treated differently by IRBs - because researchers are analyzing “dead” data that have no effect on real people’s lives. Ongoing research projects in which interventions are made in real people’s lives are held to a different standard. And should be.
3) According to Edwize and the NYT article, teachers were not notified of the study. What went wrong is that at some point this went from an academic study to a human resources project that Chris Cerf wants to take prime time. Perhaps he misspoke, or the NYT article had this wrong, but it appears that these data, collected under the auspices of an academic research study, may be used as early as June. As eduwonk noted, simply gathering the data is not a problem. The problem is that under the cover of “academic research,” data are being given to principals in ways that affect teachers’ future employment without teachers’ knowledge.
The irony, of course, is that none of this would be a big deal if the project had been announced to teachers. When I watched the recruitment session video back in September, it didn’t seem like a big deal at all. I made a note that this was an interesting experiment conducted by two researchers whose work is first rate, and assumed that the experiment would proceed under normal conditions (i.e. full disclosure of the study). For reasons I don’t fully understand, it didn’t. And here we are.
There’s much more to say about the methodological and broader philosophical issues with value-added measures. I’ll follow up with a post on these issues later.
Update: eduwonk and I continue our bridging differences exercise. He wrote:
Her position here would be a lot more compelling if (a) this were an actual experiment in the way she and other anti-Klein partisans are seeking to describe it rather than what it is. In addition --and again-- the fact is that we don’t know what they are doing with the data so at this point all these leaps to various consequences are unfounded.
But we do know what they are doing with the information, at least in the context of this experiment (and, as I have explained above, it is an experiment). Principals in the treatment group are given value-added data reports on each of their teachers. These principals’ perceptions of teachers’ academic effectiveness are thus affected - correctly or incorrectly - by this information. Saying “principals can’t use it” is like trying to strike evidence from the record in a courtroom. Jurors’ perceptions are already influenced, and the damage is done.