This is the gist of an exchange between left-of-center eduwonkette and right-of-center Jay P. Greene over the Manhattan Institute’s release of Building on the Basics: The Impact of High-Stakes Testing on Student Proficiency in Low-Stakes Subjects by Greene, his long-time associate Marcus Winters, and Julie Trivitt.
It may be an elegantly executed study, or it may be a terrible study. The trouble is that based on the embargoed version released to the press, on which many a news article will appear today, it’s impossible to tell. There is a technical appendix, but that wasn’t provided up front to the press with the glossy embargoed study….
By the time the study’s main findings already have been widely disseminated, some sucker with expertise in regression discontinuity may find a mistake while combing through that appendix, one that could alter the results of the study. But the news cycle will have moved on by then. Good luck interesting a reporter in that story.
Consumers of this information are generally aware of these trade-offs and assign higher levels of confidence to research as it receives more review, but they appreciate being able to receive more of it sooner with less review….
In short, I see no problem with research becoming public with little or no review.
I find Greene’s final statement about education research both incredible and unsurprising. This week, incredible. Next week, unsurprising.
This Week: Incredible, in that it’s wrong on its face. When research based on quantitative analysis is offered to the media “with little or no review,” its quality does not improve, it suffers. That counts as a problem.
The reasons are not hard to discern.
Lack of Transparency. Education policy research is intended to prompt the decisions and actions of education policy makers. When researchers release their findings, they are at least suggesting that policy makers – and those of us who influence policy makers – can rely on the accuracy of their studies’ underlying methodology and the evidence it yields.
Where that research is qualitative, method and evidence are generally transparent – almost always integral to the written report. Specialized expertise may well add considerable value to the interpretation of such information – this writer has certainly made such arguments in the case of reviews of No Child Left Behind’s Reading First and waiver provisions. Regardless, qualitative research is accessible to the critical review of informed citizens.
This is simply not true of quantitative research. The method and evidence are briefly summarized in the text; technical data are presented in tables and figures. The explanations of both are placed in a technical appendix or made available on request.
Ignoring Human Behavior. Specialized expertise is absolutely necessary to interpret the data used and created by quantitative methods and assess its validity. Everyone without the skill set is pretty much required to take the research on faith. The idea that “(c)onsumers of this information are generally aware of these trade-offs (between timeliness and reliability) and assign higher levels of confidence to research as it receives more review” may be true where the consumers are peers. In my own experience, the idea that it applies to the rest of the world amounts to wishful thinking at best.
Indeed, as a market advocate, I’m sorry to say that Greene’s statement is a good example of traditional pro-market education policy wonks’ overconfidence in the translation of theory to practice. Market theory assumes “perfect information” – that everyone knows everything important about the factors influencing decisions to buy and sell. Market practice is based on having information superior to your competitors’, and better than that of the party on the other side of your transactions.
At least as important, decades of cognitive research have demonstrated and confirmed that human beings tend to choose evidence that bolsters preexisting beliefs, and resolve the dissonance contrary evidence engenders by rejecting it outright or explaining it away. This large body of work not only argues for peer review because it might catch such errors on the part of researchers, it tends to disprove the idea that non-experts will view research findings with a critical eye.
As a market advocate, I find in this the basic argument for government regulation of markets, and for the value of professional practices aimed at assuring consumers of some basic level of quality. Caveat emptor can never be repealed, but surely education researchers should think less of it than, say, the manufacturers of house paint. Let me go one step further: when consumers are left to determine the basic quality of quantitative research unaided by prior review, why shouldn’t we consider such work the intellectual equivalent of patent medicine?
Peer Review. Whether their methods are qualitative or quantitative, education policy researchers bear a burden of proof. Particularly in the case of quantitative research, professional norms demand that the first step in meeting this burden is subjecting their findings to the serious review of one or more disinterested peers prior to publication.
No means of quality control is perfect, but well-designed peer review processes offer the consumers of research some assurance of validity. They help to identify flaws in method, data, and logic; point out areas where more explanation is required; and suggest alternative conclusions. At least as important, they allow discussions and negotiations regarding corrections to remain behind closed doors – where authors are more likely to be concerned about getting it right than saving face.
The better the review process, the greater our assurance that problems of data, method and reasoning have been ironed out. My own experience at RAND may be instructive. Typically, one or two peers were selected or approved by the program director managing the area where the author worked. In the case of qualitative research, reviewers were given one or two days to examine the work and comment. In the case of quantitative work, two to three days was more likely, and reviewers were often given additional time if required.
In most qualitative research, reviews tend in the direction of “I know good work when I see it.” Frankly, it’s a tough job for program directors to get real expertise and experience in the subject matter of policy free of political bias. Assuring balance is the program director’s job.
In quantitative work, reviewers assess the advantages and shortcomings of the data available for the analysis and/or the approach to collecting new data; the applicability of the various methodological tools employed by the authors; the calculations made to produce the quantitative findings; and the inferences drawn from those findings – all in the context of a given research budget. Surely various forms of reviewer bias can come into play, but the quantitative review is a lot easier for the program director trying to decide whether a study is ready for prime time.
Formal review of quantitative analysis was particularly important at RAND, where the authors may have had expertise in such tools – even degrees where such expertise was required – but lacked academic credentials in the quantitative methods as applied to the field of study under examination. A doctorate in economics may require knowledge of the same quantitative routines as an Ed.D. in Quantitative Policy Analysis, but a person with the latter has more training in their application to education. It’s not so much that a little bit of knowledge is a dangerous thing as the fact that such a review adds substantially to a program director’s confidence in the methodology selected by the authors. (On this, see paragraph three here.)
With qualitative research, the most egregious errors will be caught, but even the best review process leaves beauty to the eyes of the beholder. In this world, that’s the best we can do. The validity of quantitative research released following a solid review is something policymakers can reasonably rely upon. The validity of quantitative research released with little or no review is a matter of reasonable doubt.
In any case, my experience as a reviewer, reviewee and project manager involved in qualitative and quantitative research at RAND convinced me that – at a minimum – the prospect of a formal review by one’s peers does tend to concentrate the mind in ways that help prospective authors remain in the vicinity of the straight and narrow path towards sound analysis.
Publication Prior to Review. I noted above that the overwhelming majority of readers who consume education research work are simply not equipped to assess quantitative analysis. My client publication School Improvement Industry Week provides summaries of program evaluations in each issue, and I include excerpts from accessible work on evaluation methodology regularly. After four years of doing so, I still fear that acquiring a working knowledge of method sufficient to judge the quality of quantitative research is beyond the time and attention of most people working in school improvement.
It is equally unrealistic to trust the screening function to reporters. Given the news cycle (yes, education media have one), even if researchers make all the relevant data available, reporters simply don’t have time to find a competent expert to verify the work. Moreover, validity is just one of many considerations that go into an editorial decision to publish something about a study, and probably not the most important. Reader interest in the topic, the relevance of the research to policy, and the researchers’ reputations as media personalities are also important (yes, the education media have their own personalities).
Studies released with “little or no review” tend to attract the interest of ideological and peer rivals. The likelihood of controversy is always a factor favorable to reporting. But when peer review becomes a post hoc, public process, where self-appointed reviewers tangle with rival authors, it’s hard to argue that we’re improving on the system of peer review prior to publication. Indeed, partisan “expert” debate is bound to undermine whatever relevance the work might have had to policymaking prior to publication. The result is almost always reduced to a “he said, she said” story.
If it’s reasonable to assert that education policy research, especially quantitative research, is likely to be improved by review prior to publication and just as likely to be undermined when review is postponed until afterwards, why isn’t all such research subjected to peer review?
I can conceive of four potentially reasonable arguments to forgo peer review. None apply to policy research in public education.
High Cost. Peer review can be expensive, but more often it is quite reasonable. At RAND, where the reviewer is likely to be a staff member whose fully loaded day rate is probably over one thousand dollars, review costs ran from $2,000 to $8,000. Outside reviews were comparable in cost, if not lower, and almost always received before their in-house counterparts. Some particularly credible specialist might charge $2,000 per day, but for most work a competent assistant professor could be hired to do a complete review for $500.
Since I left RAND, I have been paid as little as $250 and as much as $2,000 to review qualitative education research. While COO at New American Schools, I authorized payments of as much as $5,000 to have quantitative education research produced by parties other than RAND reviewed by a reasonably well-regarded full professor on a very short timeline. I would estimate that most of today’s quantitative research in K-12 education could be reviewed competently at the assistant professor level for $1,000 to $2,000. In short, for most education policy research organizations, the cost of a plausible peer review is well within reach.
Competent Reviewers. Perhaps everyone can’t hire someone like Dan Koretz at Harvard to review their quantitative analysis, but every such notable can recommend a half-dozen assistant professors. With a modest effort, competent disinterested reviewers can be found.
Tight Timelines. I know that at RAND, researchers engaged in quantitative work often had a contractual obligation to produce a report by some date, but very few of these dates were established to support a specific decision, and very few of these studies were considered necessary to specific decisions. The exceptions were a handful of studies done for the Defense Department. In these cases, reviews were obtained on time, even if that cost extra. Aside from RAND’s work for New American Schools, which supported board decisions to continue investment or drop Design Teams, I have yet to see a work of quantitative research drive a single specific decision in education policy. In short, the overwhelming majority of education research can wait for peer review – there’s no policy train about to leave the station.
Peer Recognition. If you fear someone is going to publish something like your research before you, and you want to get the credit, there is a temptation to preempt. It may not be playing by the rules; it may or may not be fair in some larger context; but in any case it rarely applies to education. Education research findings do not often rise to the equivalent of finding a planet, discovering a particle, inventing a vaccine, cloning an animal, or mapping the human genome.
Two unreasonable arguments to forgo review come to mind. Researchers may not want their work to be subject to review prior to publication because they fear it might not be published sufficiently close to “as is” to support their conclusions and recommendations. Alternatively, they might be “above the law” in fact – subject to no real-world penalties for disregarding professional norms.
Next Week: Why I find Greene’s statement about education research unsurprising.
Those of us involved deeply in policy analysis hold the validity of research findings in the highest esteem. But in the larger world of people interested in public education, including generalist policymakers, validity is less important than credibility. When something makes the papers, it becomes credible. Education researchers understand this, which is why they maintain relations with the press. Those who seek to influence policy always feel the tug of media credibility and, in its pursuit, some researchers will be prepared to sacrifice validity.
Marc Dean Millot is the editor of School Improvement Industry Week and K-12 Leads and Youth Service Markets Report. His firm provides independent information and advisory services to business, government and research organizations in public education.
The opinions expressed in edbizbuzz are strictly those of the author(s) and do not reflect the opinions or endorsement of Editorial Projects in Education, or any of its publications.