The Bill and Melinda Gates Foundation’s intensive teaching partnership projects show signs of promise, but they have not yet translated into widespread achievement gains for students.
That’s one of the major results from three interim reports released today by the RAND Corporation covering implementation of the grants, access to effective teaching, and impact on student outcomes through 2013-14. Researchers are still analyzing another two years of data, so this is really a teaser of things to come.
Gates announced the intensive partnerships in 2010. The foundation gave grants to three districts (Memphis, Tenn.; Pittsburgh, Pa.; and Hillsborough, Fla.) and to one consortium of charter schools in California. The grantees agreed to craft new teacher-evaluation systems (including consideration of student achievement), and to re-orient professional development, teacher career options, and hiring around those systems.
The Gates investment came almost at the same time as the U.S. Department of Education’s Race to the Top competition, which prioritized similar themes.
Overall, the researchers underscored one major factor: All of the districts made the significant changes they promised to the Gates Foundation in their plans, despite struggling with certain elements like professional development.
“This isn’t an example of a reform that wasn’t implemented, or where there was really strong resistance. Quite the opposite,” said Michael S. Garet, the principal investigator on the implementation report and a vice president of the American Institutes for Research.
Education Week reported that as of 2013, Gates had spent nearly $700 million specifically on its teacher-quality initiatives. Of that, some $281 million went towards the Intensive Partnership sites.
“The findings so far confirm for us that changing systems to improve teaching quality is very complex work and requires persistence and patience,” said Mary Beth Lambert, a spokeswoman for the foundation. The foundation remains hopeful that the final two years will show strong results, and it will build on all of the findings, she added.
The reports are a densely packed 500 pages, so you’ll want to read them all for the full picture. Below is a small sample of the findings.
1. The grants’ impact on student achievement is mixed—but appears to be trending upwards.
Overall, the researchers found few statistically significant gains over the time period studied. In one district, Memphis, student achievement dipped for a few years in grades 3-8 math, but had begun to rebound by 2014; in Pittsburgh, achievement went up in grades 3-8 math. In both cases, only a few years’ estimated impacts were actually statistically significant.
Take a look at the graph below, which represents Hillsborough’s 3-8 achievement scores. The white line is the intervention and the blue band the confidence interval; the x-axis represents standard deviations. As you’ll note, any impact is pretty close to zero overall.
It’s challenging to measure the impact of district-level initiatives since there’s no real control group, so the researchers used two different methods for their analysis. First, they used a “difference in difference” approach, comparing the district’s achievement trajectories to those in other districts and accounting for their different demographic factors. To cross-check those results, they also created a “synthetic” control group for each district populated by students elsewhere in each district’s respective state that closely matched the population of the district.
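For readers who want to see the arithmetic behind the first method, here is a minimal sketch of a difference-in-differences estimate. It is not RAND’s actual model: it skips the demographic adjustments and the synthetic control group described above, and the function name and every score in it are made-up illustrations, not figures from the reports.

```python
from statistics import mean


def diff_in_diff(treated_pre, treated_post, comparison_pre, comparison_post):
    """Return a simple difference-in-differences estimate.

    Each argument is a list of average scores (here, in standard-deviation
    units), one entry per year. The estimate is the change in the treated
    district minus the change in the comparison districts over the same span.
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    comparison_change = mean(comparison_post) - mean(comparison_pre)
    return treated_change - comparison_change


# Hypothetical scores, invented purely for illustration.
grantee_before = [-0.05, -0.03]      # grantee district, years before the grant
grantee_after = [0.01, 0.03]         # grantee district, years after the grant
comparison_before = [-0.02, 0.00]    # comparison districts, same early years
comparison_after = [0.01, 0.03]      # comparison districts, same later years

estimate = diff_in_diff(
    grantee_before, grantee_after, comparison_before, comparison_after
)
print(round(estimate, 3))
```

The point of subtracting the comparison districts’ change is to strip out whatever was happening statewide anyway, such as a new test or a general upward trend, so that what remains is a rough estimate of the grant’s own contribution.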
Interpreting these findings is challenging. It took several years for the grantees to put all of the various components in place, so perhaps it’s not surprising that any benefits are only now emerging. Also, keep in mind that these are averages across many schools in a district, which tend to obscure particularly high- or low-flying schools.
“It’s possible that this is a worse-before-better situation; we just don’t know,” Garet said. “We really need the next two years of data.”
Another way to consider these findings is to compare the 2013-14 effect sizes to other types of interventions. Below, you can see how the reading findings stack up. According to the report, the teaching reforms didn’t have as big an impact as classroom-level interventions, such as comprehensive school reform, or CSR, programs (think Success for All). But their impact was stronger than that of other district interventions, such as charter schools or technical assistance (DAIT, below).
Most importantly: RAND will continue to analyze results in 2014-15 and 2015-16, so stay tuned.
2. Teachers’ enthusiasm for evaluation has declined.
The reports indicate that teachers, in general, did find the observation portion of the system helpful for reflection and changing practices. Between 60 and 80 percent of teachers in each district said that the evaluations helped pinpoint areas of weakness and instruction.
But they were not convinced that the entire evaluation apparatus was really going to help students: The proportion of teachers who said it would has declined in the three districts since 2011, most precipitously in Pittsburgh. In site visits to each of the districts, teachers reported that the evaluations were time-consuming and that getting ready to be observed took up a lot of preparation time. (Enthusiasm is higher among teachers in the charter schools in the consortium that won the fourth Gates grant.)
3. Professional development remains a challenge.
This finding probably isn’t that surprising, but it’s still noteworthy. One of the main goals of the project was to generate individual, actionable feedback for each teacher. But while principals did report using the results from the evaluation for professional development, they didn’t always have the best tools or infrastructure for doing so.
“Sites are finding it difficult to link individual PD to needs identified by the teacher-evaluation process, a challenge that stems from incomplete information in the evaluation measures, from a lack of PD options that are tailored to areas of weakness that those measures identify, and from insufficient time for principals to offer tailored feedback to each teacher,” the authors wrote in the report.
Gates has invested a lot in personalized professional development over the last several years, probably as a result of these kinds of challenges cropping up along the way.
4. Teacher-evaluation results skew positive.
As is the case elsewhere, teachers got pretty good results from the new teacher-evaluation systems. Pittsburgh initially identified a higher percentage of weaker teachers, while Memphis gave out the most top ratings.
5. The grants didn’t do much to affect teacher-distribution patterns.
Here again, the results were pretty mixed.
The districts started with good distribution patterns, with low-income and minority students generally getting slightly better access to effective teachers than their peers. By the end of the 2013-14 school year, that pattern had persisted, in some cases getting even better (Memphis) and in one case slightly worse (Hillsborough).
But these patterns do not seem to have been a result of any particular district effort. Though the districts tried incentives, like extra pay for teaching in schools with more disadvantaged students, those programs don’t seem to have had a great uptake or much of an impact. (This is consistent with other research.) Instead, it appears that more-effective teachers are either improving their performance or replacing departing teachers who were less effective.
We’ll continue to dig into the findings. In the meantime, leave a comment and tell us what you found of interest.
The Gates Foundation commissioned the RAND evaluations. (It also provides support for Education Week.)
Charts courtesy of RAND.
For more on the Gates projects:
- Combined Measures Better at Gauging Teacher Effectiveness: Study
- Q & A: Melinda Gates Talks Teacher Quality