‘No Effects’ Studies Raising Eyebrows

By Debra Viadero — March 31, 2009 8 min read

Like a steady drip from a leaky faucet, the experimental studies being released this school year by the federal Institute of Education Sciences are mostly producing the same results: “No effects,” “No effects,” “No effects.”

The disappointing yield is prompting researchers, product developers, and other experts to question the design of the studies, whether the methodology they use is suited to the messy real world of education, and whether the projects are worth the cost, which has run as high as $14.4 million in the case of one such study.

Spate of Studies

“Impact Evaluation of the U.S. Department of Education’s Student Mentoring Program”

Purpose: To compare outcomes for children in grades 4-8 who had been randomly assigned to receive or not receive services through the department’s student-mentoring grants program.

Date Issued: Feb. 25, 2009

Results: No overall, statistically significant effects were found for any of the 17 measures studied, although some positive effects appeared for certain subgroups of students.

“Achievement Effects of Four Early Elementary School Math Curricula: Findings from 1st Graders in 39 Schools”

Purpose: To compare four mathematics curricula that reflect different approaches to teaching that subject in the early grades.

Date Issued: Feb. 24, 2009

Results: Statistically significant, positive effects were found for two programs, but none for the other two.

“Effectiveness of Reading and Mathematics Software Products: Findings From Two Student Cohorts”

Purpose: To evaluate the effects of 10 commercial software products used at various grade levels.

Date Issued: Feb. 17, 2009

Results: Only one product produced statistically significant test-score gains across both years of the study. Two algebra programs produced positive effects in classrooms that had used the programs two years in a row.

“An Evaluation of Teachers Trained Through Different Routes to Certification”

Purpose: To compare the achievement of elementary school children, in the same grades and the same schools, randomly assigned to teachers trained through either traditional education schools or alternative-route programs.

Date Issued: Feb. 19, 2009

Results: No statistically significant differences were found between the two groups.

Source: U.S. Department of Education

But proponents of the methodology say those critics ought to pay more attention to the message than to the messenger.

“I just think that’s the way the world works,” said Jon Baron, the executive director of the Coalition for Evidence-Based Policy, a Washington-based advocacy group. “The good news is that some things do work, and those are the things we should focus on and scale up.”

The studies are part of a new generation of so-called “scientifically based” research that was set in motion by the institute—the main research arm of the U.S. Department of Education—when it was created in 2002.

The body of research employs a study design called “randomized controlled trials,” in which subjects are randomly assigned to either an experimental group or a business-as-usual group. Although rarely used in education before the wave of studies backed by the IES, such designs are widely considered to be the “gold standard” for determining whether an intervention works.
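The logic of such a design can be sketched with a hypothetical simulation (not one of the IES studies): randomly drawn treatment and control groups are compared with a simple difference in means and a t-statistic. The effect size, sample size, and scores below are illustrative assumptions only, chosen to show why a real but modest effect can still register as "no effects."

```python
import random
import statistics

random.seed(0)

# Hypothetical scores: control mean 50, treatment mean 52 (a true
# 2-point effect), both with a standard deviation of 10 points.
n = 100
control = [random.gauss(50, 10) for _ in range(n)]
treatment = [random.gauss(52, 10) for _ in range(n)]

# Difference in means and its standard error (two-sample t-statistic).
diff = statistics.mean(treatment) - statistics.mean(control)
se = (statistics.variance(treatment) / n + statistics.variance(control) / n) ** 0.5
t = diff / se
print(f"estimated effect: {diff:.2f}, t-statistic: {t:.2f}")

# With 100 students per group and a 2-point effect against a 10-point
# spread, |t| often falls below the ~1.96 significance cutoff -- the
# study is underpowered, and the verdict reads "no effects."
```

The sketch illustrates one mundane reason for null findings that critics and proponents alike acknowledge: small true effects require large samples to detect reliably.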

Of the eight such studies released by the federal institute this academic year, six have produced mixed results pointing to few, or no, significant positive effects on student achievement.

They include studies on: school-based mentoring programs in elementary school; commercial software programs for teaching mathematics; various certification routes for teachers; teacher-induction programs; interventions for boosting literacy instruction for disadvantaged preschoolers and their families; and professional-development initiatives in reading.

In addition, the research agency’s final evaluation of the federal Reading First program, which uses a research design that differs slightly from the randomized controlled approach, found that the $6 billion federal reading program improved young children’s decoding skills, but failed to make dramatic differences in reading comprehension.

On the other hand, an ongoing study of “double dose” reading classes for struggling 9th grade readers is showing positive results. And a head-to-head comparison of four different elementary math curricula identified two philosophically different programs that gave 1st graders an added boost in that subject over the standard curricula.

‘Tin’ Standard?

Still, the overall results are leading some experts to question the value of the recent spate of randomized controlled studies.

“It’s not a bad idea to get people more organized and more motivated to do more experimental studies,” said Linda Darling-Hammond, a Stanford University education professor and the former lead adviser on President Barack Obama’s education transition team. “But we’re spending a lot of money on some pretty poor designs which are not likely to give us results. It’s as though in the education community we’ve taken the gold standard and turned it into a tin standard.”

Ms. Darling-Hammond points out that at least two of the studies—one on school-based mentoring and one that compared teachers who were alternatively certified with those who had come to the classroom by more traditional routes—did not have “clear treatments.” In other words, the control group and the treatment group were too similar, in her view, in important respects.

In the case of the teacher-certification study, for instance, some of the alternatively certified teachers had taken as many education courses as peers who had graduated from education schools.

In the mentoring study, which focused on school-based mentoring programs for students in grades 4-8, 35 percent of the students in the control group received mentoring services anyway. Fourteen percent of the students in the mentoring group never got matched up with a mentor.
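One way to see why such crossover matters: a randomized evaluation compares groups as assigned ("intention to treat"), so control students who get mentored anyway, and treatment students who never do, both dilute the measured contrast. A back-of-the-envelope sketch, assuming a hypothetical 5-point true effect of mentoring and using the crossover rates reported in the study:

```python
# Hypothetical: suppose mentoring truly raises outcomes by 5 points
# for students who actually receive it (baseline of 0 without it).
true_effect = 5.0

# Rates from the mentoring study: 35% of the control group received
# mentoring anyway; 14% of the treatment group never got a mentor.
control_crossover = 0.35
treatment_nonreceipt = 0.14

# Expected group means under intention-to-treat analysis:
treatment_mean = (1 - treatment_nonreceipt) * true_effect  # 0.86 * 5 = 4.3
control_mean = control_crossover * true_effect             # 0.35 * 5 = 1.75

itt_estimate = treatment_mean - control_mean
print(itt_estimate)  # 2.55 -- roughly half the true effect survives
```

Under these assumed numbers, barely half of the hypothetical effect remains in the treatment-versus-control comparison, making a "no effects" verdict far more likely even if the intervention works for those who get it.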

Another scholar, Sean P. Corcoran, an assistant professor of economics at New York University, worries that the studies, many of which have been set in schools with high concentrations of poor students, aren’t producing findings that apply to a wide range of educational settings. “What most policymakers are looking for is: What will work in my school?” he said.

The teacher-certification study is a case in point, Mr. Corcoran said.

“The schools sampled were those that routinely hired alternatively certified teachers, and those tend to be hard-to-staff schools to begin with,” he said.

In such hard-to-staff schools, where research has long shown that teacher quality is comparatively weak, it’s no surprise that the alternatively trained teachers were just as effective as those who had taken more traditional routes into the classroom, he added.

Yet study readers may come away with the impression that the findings offer a broader indictment of traditional education school training. “Interpretation is the biggest problem,” Mr. Corcoran said. “It’s not that these are poorly designed studies.”

‘Dosage’ at Issue

Michael Milone, a Placitas, N.M.-based assessment specialist who helped develop some of the programs tested in the educational software study, faults that study for paying too little attention to whether teachers were using the programs or not. (“Reading, Math Software Found to Have Little Effect on Scores,” March 18, 2009.)

“In looking at these complex evaluation studies, it’s almost like no one looks at things like how many of the kids show up,” he said.

Spate of Studies (continued)

“Enhanced Reading Opportunities: Findings From the Second Year of Implementation”

Purpose: To measure and compare the impact of two programs that aim to improve struggling 9th graders’ literacy achievement by providing an extra reading class during the school day.

Date Issued: November 2008

Results: Both programs were shown to have a statistically significant positive effect on student achievement.

“Impacts of Teacher Induction: Results From the First Year of a Randomized Controlled Study”

Purpose: To evaluate the impacts of programs used in 17 districts to provide support for beginning teachers in elementary schools.

Date Issued: October 2008

Results: No statistically significant differences were found between the treatment and control groups in terms of student achievement, teachers’ practices, or retention rates for teachers.

“A Study of Classroom Literacy Interventions and Outcomes in Even Start”

Purpose: To find out whether federal Even Start programs with a heavier emphasis on literacy instruction will lead to better outcomes for children and families.

Date Issued: September 2008

Results: For all seven measures of literacy and language, there were no statistically significant differences between children getting more literacy-rich instruction and those in regular Even Start programs. The program did lead to improvements in parenting skills, though, as well as in children’s social skills.

“The Impact of Two Professional-Development Interventions on Early-Reading Instruction and Achievement”

Purpose: To weigh the impact of two professional-development programs—one with added support from school-based coaches and one without—both aimed at improving teachers’ knowledge of “scientifically based” practices for teaching reading.

Date issued: September 2008

Results: Although teachers’ knowledge grew, there were no differences in test scores after one year between 2nd graders whose teachers took part in the programs and their peers whose teachers did not. Having reading coaches available for teachers produced a small positive effect, but it was not statistically significant.

Source: U.S. Department of Education

The educational technology study tracked the number of hours teachers used the software, for example. “But you also want to know how many kids work on the program,” Mr. Milone said. “Is the dosage intensity and duration appropriate for that student?”

The analogy in medicine, he said, might be to evaluate a drug that patients don’t take as prescribed. “If it doesn’t work,” he added, “what does that say about the medication?”

Limits to Uses

Though he was involved briefly in early partnerships with the federal research agency to make greater use of randomized studies, Harris M. Cooper agrees that randomized controlled trials, like any research design, have limitations. One is that they are better at picking up short-term effects than they are at measuring long-term results.

The studies are also better suited to detecting the effects of highly specific interventions than they are at broader education improvement efforts farther removed from the classroom, experts say.

“RCTs can be oversold, but at the same time, they are a critical part of our research arsenal, and the best approach to getting our arms around a problem, especially if they are involved with multiple, complementary [research] methods,” added Mr. Cooper, a professor of education, psychology, and neuroscience at Duke University in Durham, N.C.

Indeed, various panels of the National Academies, a key source of advice to Congress on scientific matters, have concluded that, when it comes to determining cause and effect, randomized controlled trials are the most effective research design to use.

What they cannot do, though, is reveal what’s happening inside the “black boxes” of classrooms.

While randomized studies were underutilized in education for a long time, Mr. Cooper said, “I think I would also like to see proponents be appropriately humble about what these studies can tell us.”

Lessons Learned

For their part, federal education officials say the randomized studies carried out so far have often focused on disadvantaged, inner-city schools because that is where the need for reliable solutions to education problems is greatest. And, in the case of the teacher-certification study, that is where the alternatively certified teachers are.

If some of the experiments were less concerned with fidelity to the intervention, officials add, that reflected an intentional decision to study how educational practices are used—or not used—in the real world, rather than in environments controlled by program developers.

“Lots of social programs are less effective than people think,” said Grover J. “Russ” Whitehurst, who headed the Institute of Education Sciences from its start until last November. “I think it’s in the nature of evaluation science to find more inconclusive findings than positive findings, and that’s informative. If you’re spending a lot of money on something that’s believed to be effective, and now you have questions about its effectiveness, then I think it’s a positive thing.”

Finding positive effects is also more challenging in education, because typically students in both the treatment and control groups are making academic progress.

“It’s not a question of whether a particular new intervention is efficacious at all,” said Richard J. Murnane, a professor of education and society at the Harvard Graduate School of Education. “It’s a question of whether it’s better than what we would’ve been doing otherwise.”

Randomized studies will also be more useful as they become part of an ongoing program of research, Mr. Murnane said. For example, when a large randomized study found that students who participated in after-school programs received no special boost in test scores, compared with those not participating, the IES underwrote a second study to see what would happen if those after-school programs had a stronger academic component.

The federal research agency receives no special allocation, though, that would enable it to build a thoughtful, long-term plan of study, according to Mr. Whitehurst, who now directs the Brown Center on Education Policy at the Washington-based Brookings Institution.

“So IES ends up doing a bunch of one-off evaluations that are either desired by Congress or desired by some other program office in education that has money available,” he said. “That makes us like MDRC or Mathematica: a research organization that does mostly studies that somebody else is able and willing to fund.”

Phoebe H. Cottingham, the commissioner of the institute’s National Center on Education Evaluation and Regional Assistance, which oversees many of the large-scale studies, said policymakers have learned some lessons about choosing educational models to be tested with randomized studies.

“Some of them were based on what are fairly weakly supported ideas,” she said. “It doesn’t mean you’re going to get an effect just because something worked in one efficacy trial.

“We think we’re going to have more luck with the next cohort of studies,” she added.

A version of this article appeared in the April 01, 2009 edition of Education Week as ‘No Effects’ Studies Raising Eyebrows