From school turnaround to family engagement to new teacher mentoring, the Every Student Succeeds Act gives states greater flexibility and a more nuanced approach to using research to guide programs and policy.
Still, it remains to be seen whether states have the capacity to shoulder greater responsibility for deciding what constitutes solid evidence in education.
While the No Child Left Behind Act and earlier reauthorization attempts famously spoke of “scientifically valid” research, ESSA calls simply for programs and interventions to be “evidence-based.” It also leaves most of the responsibility to states to decide the quality of that evidence.
“I think this is actually a great opportunity, because every state is going to need to think about what evidence-based means for them,” said Carrie Conaway, an associate commissioner for planning, research and delivery systems for Massachusetts’ education department. State officials and researchers, she said, need more conversations “to think about what strong evidence looks like—and where we do and do not have strong evidence. There are plenty of programs funded by ESSA that require evidence now where we don’t currently have great evidence on what does and doesn’t work.”
For example, Conaway noted that school turnaround is a critical piece of ESSA—which requires states to focus on their lowest-performing 5 percent of schools—but there are only a handful of improvement models that have been rigorously evaluated. “That’s a really nascent area of research. The good news is people are starting to build evidence there, but it’s an area where we don’t know a lot about what works and why.”
From Gold Standard to Tiers
Under ESSA, any strategy or intervention can be considered evidence-based if it shows a statistically significant effect on improving student outcomes or other relevant outcomes. The definition actually encompasses four separate standards of research. The first three tiers, modeled on the Investing in Innovation, or i3, grant program, are required for the law’s accountability-related school improvement programs:
- Strong evidence includes at least one well-designed and -implemented experimental study, meaning a randomized controlled trial. This is the highest bar, and very few school improvement models have actually met it so far.
- Moderate evidence includes at least one well-designed and -implemented quasi-experimental study. For example, a program evaluation could use a regression discontinuity analysis, in which researchers might look at differences in outcomes for students who scored a point above and below the entrance cutoff score for a particular program or intervention.
- Promising evidence includes at least one well-designed and -implemented correlational study that controls for selection bias, the potential differences between the types of students who choose to participate in a particular program and those who don’t.
The tiers mirror evidence structures that are already required in other IES research grants and in several other agencies. Felice Levine, the executive director of the American Educational Research Association, argued that the Education Department should base ESSA guidance to states on a framework for judging evidence quality that has been developed by the IES and the National Science Foundation.
“I think Congress did a nice job of transitioning from scientifically based research that was ultimately more than what was needed in NCLB to giving states the opportunity to look at what they are going to do under the tiered approach,” said Paul Kimmelman, a senior education adviser for the American Institutes for Research and a longtime federal education policy watcher. “I have confidence that the state education chiefs will try to use top-quality evidence to make decisions.”
Grover “Russ” Whitehurst, a Brookings Institution fellow and the first director of the federal Institute of Education Sciences, said it makes sense for Congress to take a step back from requiring specific types of research in ESSA. NCLB erred, Whitehurst said, by “demanding the use of [scientifically based research] when it didn’t exist ... [and] getting Congress into the business of defining SBR in education. Perhaps these were justified at the time because the state of education research was really awful. But neither makes sense today.”
Accountability and Training
The law, however, doesn’t actually make a distinction among the three levels—all of them are considered “evidence,” but states make the ultimate decision of how to use the tiers. Moreover, everything else in the law that must be evidence-based can use those tiers—but if there is nothing that meets any of the levels, there’s a fourth tier. If a state or provider can show that a program’s rationale is based on high-quality research or a positive evaluation that suggests it is likely to improve student or other important outcomes, that is enough, as long as the program or intervention has ongoing self-evaluation as well.
Massachusetts’ Conaway praised the fourth tier for making it easier to test and evaluate promising interventions and programs—“I think it will really help us build good research if it’s used well,” she said—but there’s also little in ESSA that the Education Department can use to hold states accountable for not scrutinizing evidence. Some states, like California, have their own laws in place requiring educators to use “current and confirmed research” in areas like reading instruction, but others will be starting from scratch.
“I’m a little skeptical of how this will work,” said Johannes Bos, a senior vice president at the AIR and an expert in randomized controlled trials. “It’s difficult to do well and easy to cherry-pick evidence to support what you want to support. There will have to be a lot of discipline and a lot of review and training and surrounding infrastructure to make it work.”
AERA’s Levine also had concerns about the tiered approach. “While these tiers might be useful in the context of smaller scale, more locally based evaluation decisions,” Levine said, “they are less useful in supporting state policymakers’ and education administrators’ use of cumulative knowledge in education.”
Under ESSA, states make the ultimate decision on whether evidence on a subject is “reasonably available” when deciding how it should be required. That standard is likely to come into play in areas like teacher professional development, for which research on effectiveness has been mixed.
Across the board, experts argued that the Education Department will need to invest heavily in more training and technical assistance to boost state capacity to gauge research quality, as well as invest in more cross-state comparisons to find best practices as states begin to implement the new law. “You are going to have all these states with lots of freedom to make different decisions about things,” Bos said, “and you need to treat that as 50 different experiments and really learn from what these states are doing differently.”
Conaway agreed, suggesting the need for evaluations of implementation in different states could encourage more researchers to replicate prior studies, building a stronger evidence base for programs.
That could open new opportunities for partnerships between researchers and state education officials. The No Child Left Behind Act spurred a major expansion of education research in both topic areas, like school turnaround models, and in research methods. Experimental studies, for example, were relatively uncommon in education research because of time, expense, and the difficulty of finding large enough groups of students and then withholding interventions from some of them to create an appropriate comparison group. Federal guidance and grants in the NCLB years greatly expanded the number of such experimental studies conducted in education.
Prior federal research projects, such as the Institute of Education Sciences’ research alliance grants and grants to build up state longitudinal student databases, have helped lay the groundwork for ESSA’s approach, said Paige Kowalski, the vice president of policy and advocacy for the Data Quality Campaign. “Five years ago, we just weren’t having these conversations and state education agencies weren’t really in the position to act in this way,” Kowalski said. “We were just having the conversations around moving from a model of compliance to one of service.”
Next Generation i3
President Obama’s Investing in Innovation grant program, which lent its tiered-evidence framework to the law, has also survived in a more targeted form as Education Innovation and Research grants.
The grants include early-phase awards to develop, implement, and test new and promising programs. Second-level grants would support full, rigorous evaluations of programs that have already been successfully implemented.
The third level of grants would target programs that have shown “sizable, important impacts” in those evaluations. These grants would support both scaling up and, interestingly, replication studies to make sure the original results were correct and “identify the conditions in which the program is most effective.”
The innovation and research grants were a big win for Jon Baron, the vice president of evidence-based policy at the Laura and John Arnold Foundation. “I think it’s an important toehold, and potentially, at some point, a step forward,” he said.
A version of this article appeared in the January 06, 2016 edition of Education Week as ‘Evidence’ Requirements Are Redefined in ESSA