Call this one the case of Abell v. NCTAF. In an 80-page report last fall, the Baltimore-based Abell Foundation claimed to debunk the idea that fully licensed teachers are any better than those who lack such state credentials. Within days, the National Commission on Teaching & America’s Future shot back with a 50-page rebuttal that tore apart the Abell document.
That response was then raked over the coals a few weeks later in a 17-page rejoinder, co-written by Abell’s senior policy analyst. Throughout the exchange, the charges flew like chairs on “The Jerry Springer Show.” Using words like “shameful” and “dishonest,” the parties accused each other of hypocrisy and of harboring ulterior motives.
Think you know what the research says about effective teachers? Think again.
What are the rest of us to make of this apparent tit for tat?
For starters, if the pitch of the debate seems higher than in other areas of education research, it’s partly because the stakes are also higher. Vouchers and bilingual education may be controversial, for instance, but they’re still just slices of all that goes on in the field. It is a given, however, that every public school student in America has a teacher. As Lee S. Shulman, the president of the Carnegie Foundation for the Advancement of Teaching in Menlo Park, Calif., says: “Because [the teaching profession] is so universal and public, in every sense, it is under very, very special scrutiny.”
But it’s more than that. Underlying much of the debate about teacher quality is a fundamental rift over what to make of the existing body of research on the topic. There’s actually little disagreement about what’s in the research, but experts from different camps quickly part ways when it comes to how the data should be interpreted and how policymakers should respond.
In large measure, that is because while the past 30 years have seen a multitude of studies on the characteristics of effective teachers, what’s lacking are large-scale, controlled studies that get at one of the most crucial questions: How should teachers be trained?
“In this game where nobody has the definitive evidence, the person who ends up with the burden of proof loses,” says Ronald F. Ferguson, an economist and lecturer at Harvard University’s John F. Kennedy School of Government. “If you are someone who wants to claim that professional-development programs in general make a difference, you might be right, but you don’t have the evidence, so you lose.
“On the other hand, if someone wants to claim that professional-development programs, by and large, don’t make any difference, they would be on shaky ground as well.”
Help may be on the way. Many experts say a brand-new generation of research on teacher quality promises to bring the discussion to a new level. That such studies can now take place stems partly from increased state-level testing of student achievement, which can track the test-score gains of individual teachers’ students.
Researchers also are honing the way they define the characteristics of teacher effectiveness they’re measuring, so they’re better able to compare apples with apples when examining types of instructional practice or models of teacher preparation. If successful, such studies could reveal the missing links between how teachers are trained, what they do in the classroom, and how their students perform.
But first, what do most scholars agree on? Perhaps the least contested studies on teacher quality point to the importance of teachers’ basic skills.
Some of the most cited research on that topic was carried out by Harvard’s Ferguson, who has analyzed data from Arkansas and Texas, two states that had, at one point, required all teachers to take a basic-skills test as part of their relicensure requirements. In Texas, for example, Ferguson found that teachers’ scores on the skills exams had the strongest link to student achievement of four different variables he examined. The other variables, in descending order of effects, were teacher experience, class size, and whether teachers held master’s degrees.
Other comparatively uncontroversial findings show a link between student improvement and teachers’ knowledge of the subjects they teach. In a study of science and mathematics at the secondary school level, education professor David H. Monk of Pennsylvania State University found that the number of college-level courses that educators took in those content areas correlated positively with student achievement.
But his analysis—based on two federal databases of information on teachers and students—also hinted at a point of diminishing returns. For example, it seemed to make a big difference whether 11th graders’ teachers had taken at least five math courses, but each additional course thereafter did not translate into as big a student-achievement gain. Monk also found that teachers’ course-taking was more important than their attainment of a master’s or doctorate.
What a policymaker should make of all that depends on whom you ask. Last fall’s Abell Foundation report suggested that content knowledge and basic skills, such as verbal ability, should be the primary factors in states’ licensing of new teachers. After that, the study argued, it should be up to school leaders to decide whom to hire.
Abell favors strategies like Teach For America, a program that recruits noneducation majors from selective colleges, gives them a few weeks of training, and then places them in jobs at inner-city and rural schools for at least two years. Indeed, the teacher-candidates that TFA attracts have been shown to have exceptionally high verbal abilities. The thinking goes: Why shouldn’t schools be free to pick such recruits, even if they lack teacher-training coursework, so long as they’re able to demonstrate student improvement?
“If you’re going to hold people accountable, you’ve got to let them make decisions,” says Michael Podgursky, an economics professor at the University of Missouri-Columbia who co-wrote Abell’s rejoinder to the NCTAF critique. “This is no different than regulating food, and drugs, and all other kinds of fields, in that you’re focusing on the outcome. You’re protecting the public by focusing on the final product.”
Some experts, though, hold less faith in the ability to predict student success based on teachers’ backgrounds. Teachers’ basic skills and subject-matter knowledge may matter, they say, but the question is how much. Gene V. Glass, an education professor at Arizona State University in Tempe, recently reviewed the literature on teacher effectiveness to write a chapter for an upcoming book. He believes that such variables may be overrated. For example, he says, a close reading of the research suggests that a very large increase in teachers’ basic skills would translate into a much smaller amount of improvement in student performance.
“The relationships,” he says, “are so small that they are not considered useful.”
Moreover, many scholars say that what isn’t known is how many Teach For America-style candidates are out there and willing to work where they’re needed. The country’s public education system employs 2.9 million teachers. An estimated one-quarter of them teach in schools where 50 percent or more of the students take part in the federal free or reduced-price lunch program. Typically, high-poverty schools face the greatest need for new teachers.
The Carnegie Foundation’s Shulman explains the problem with an analogy he credits to a colleague: “Everybody takes for granted that some people can play piano by ear and don’t need instruction. But if you needed 2 million or so people to play the piano in our schools, suddenly you begin to wonder with more precision what it takes to do this well.”
Which brings the discussion to perhaps the most disputed (and, some say, most important) research topic related to teacher quality: how teacher preparation is linked to student achievement. For many experts, the issue boils down to whether it matters that someone has been trained in teaching methods and theories.
It’s a critical question at a time when many schools are finding it hard to fill all their teacher vacancies, and when states are showing greater interest in “alternative routes” that allow candidates to enter the profession without going through a traditional teacher education program.
To get at the question, some scholars have examined the types of state licenses that teachers hold. Dan D. Goldhaber, an economist at the Urban Institute, and RAND Corp. researcher Dominic J. Brewer carried out one such study two years ago. The two analyzed a federal database that tracked teacher characteristics and student performance over several years.
One finding indicated that the students of math teachers with so-called “emergency” licenses did no worse than the students of teachers who held standard credentials. (Emergency licenses are generally given to new teachers who have yet to fulfill all their states’ requirements for full credentials.)
The study touched off a debate that played out in the pages of scholarly journals for nearly a year. Linda Darling-Hammond, then the executive director of the National Commission on Teaching & America’s Future, co-wrote a response arguing that the Goldhaber and Brewer conclusion on emergency-certified teachers really said little about how the educators were trained. According to Darling-Hammond, one couldn’t assume that teachers with emergency licenses lacked coursework in education.
Delving into the data herself, she said she found that large numbers of teachers holding emergency licenses actually had education school training, but lacked coursework in other areas required for full licensure in their states. Experienced teachers who transfer between states, for example, often are not given full credentials until they complete a class in regional history.
Meanwhile, Goldhaber and Brewer defended their work in a published response, in which they argued that their original report never claimed to provide evidence of the relative merits of different models of teacher preparation; it only contended that there appeared to be no difference between teachers with emergency and standard credentials. Wide variations across states’ licensure requirements made it difficult to do such an analysis from national data, they said.
In an effort to compare apples with apples on the credential question, David C. Berliner, an education professor at Arizona State University, has begun to examine the licensure question in two districts within his own state. Along with ASU doctoral student Ildiko I. Laczko-Kerr, he has sought to match pairs of teachers who have similar backgrounds and characteristics except for their licensure status. In an unpublished paper they presented at last spring’s American Educational Research Association conference, the two reported they had found that the students of fully licensed educators consistently posted greater gains than those whose teachers lacked full licenses.
In fact, they said, the effect of having a standard license appeared to translate into about three or four months’ student-achievement growth. The implication is that having three emergency-licensed teachers could put a student a year behind his or her peers academically, since three such shortfalls add up to roughly nine to 12 months. Berliner and Laczko-Kerr continue to match up more teachers to test their hypothesis, and so far, they’re up to about 100 pairs.
If previous studies are an indication, the number of comparisons will likely drive some of the debate over the study once it is published. Goldhaber and Brewer’s finding that emergency credentials didn’t seem to matter was based on a subsample of about 60 math and science teachers—a number Darling-Hammond suggested was too small to generalize from.
Also, like Goldhaber and Brewer’s work, the Berliner and Laczko-Kerr study may not give a clear picture of the benefits of different types of teacher preparation. Besides the problem of not really knowing how many emergency-licensed teachers came through alternative routes and how many came through conventional schools of education, another consideration clouds such analyses: Not all alternative routes are the same, and neither are all traditional teacher-preparation programs.
For example, Georgia’s new fast-track licensing initiative, called the Teacher Alternative Preparation Program, allows candidates to be placed in their first teaching assignments after a short summer-training course. Contrast that with Colorado’s Project Promise (a 14-year-old program endorsed by the National Commission on Teaching & America’s Future), which puts midcareer recruits through a yearlong regimen that includes 20 weeks of student teaching, along with seminars on school law, the use of student assessments, and multiculturalism. The two programs bear little resemblance, and yet both are considered alternative routes.
Similarly, few would argue that there aren’t wide variations in the content and quality of traditional teacher-training programs. Indeed, Penn State’s Monk examined possible links between students’ achievement and their teachers’ coursework in teaching methods and found a positive correlation, but he, too, conceded that likely variations among the content of those courses made it difficult to draw any definitive conclusion.
“You’ve got to find ways to describe what experiences they’ve had in teacher preparation that fit with what are effective, and be specific enough so that you won’t have something that has a general label with such a wide variety of things that go under it,” says Robert E. Floden, an education professor at Michigan State University in East Lansing.
A number of ongoing research projects are attempting to help sort things out. The Carnegie Corporation of New York is supporting two such studies. One, financed at about $1 million, involves a state-by-state review of alternative routes to describe what actually goes into the different programs. The other is an attempt to identify what a high-quality education school program looks like. Under that effort, about a half-dozen education schools will each receive up to $5 million to retool their programs in ways that produce more effective graduates.
“No program will be permitted to participate in this initiative that does not include as principal measures of effectiveness of their program student-learning gains made under the tutelage of graduates of the program,” says Carnegie official Daniel Fallon.
The connection between program content and teacher effectiveness reflects a marriage that many experts see as highly promising. In part, that exploration has been inspired by the work of researcher William L. Sanders, who has sought to perfect a method of linking individual teachers with the test-score gains their students made over time. The statistical tools he uses to estimate teacher effects are highly complicated, but the essential element is having an assessment system that tests virtually all students, each year, in a consecutive block of grades. Tennessee, where Sanders began his work, is one of a few states that do so.
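As a rough illustration of why annual testing in consecutive grades matters, the simplest possible version of the idea can be sketched in a few lines of Python. This is not Sanders’ method, which relies on far more elaborate statistical models; it is a bare gain-score average, and every name and number in it (the records, the teacher labels, the function `mean_gain_by_teacher`) is invented for the example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (student, teacher, last year's score, this year's score).
# All names and numbers are invented for illustration only.
records = [
    ("s1", "Teacher A", 40, 52),
    ("s2", "Teacher A", 55, 63),
    ("s3", "Teacher B", 42, 45),
    ("s4", "Teacher B", 60, 65),
]


def mean_gain_by_teacher(rows):
    """Average the year-over-year score gains of each teacher's students."""
    gains = defaultdict(list)
    for _student, teacher, prior, current in rows:
        # A gain can only be computed when the same student was tested
        # in both years, hence the need for annual testing.
        gains[teacher].append(current - prior)
    return {teacher: mean(values) for teacher, values in gains.items()}


teacher_gains = mean_gain_by_teacher(records)
print(teacher_gains)
```

Even this toy version shows the dependency: a student who is tested only every few years yields no usable gain for the teachers in between. A real analysis would also have to account for factors this sketch ignores, such as differences among students and measurement error in the tests.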
But new federal requirements in the “No Child Left Behind” Act of 2001 signed by President Bush in January will require all states to test students in reading and math annually in grades 3-8. The measure could prove to be a boon for researchers by turning more states into laboratories in which different teacher characteristics can be measured against student test-score gains. So far, the kind of “value added” research Sanders has refined has been used mostly to identify effective teachers, not to try to explain what it is that makes them so. The Carnegie Corporation of New York is now underwriting a RAND study of value-added assessment.
“It is a worthy hypothesis that good teaching does not reside in our DNA, but is rather the result of good practice and behavior,” Fallon says. “So it is incumbent upon us to really try to discover what those practices are so that we can replicate them and attempt to help other teachers practice them.”
On the opposite coast from Fallon’s operation, the Carnegie Foundation for the Advancement of Teaching also is gearing up for a major study that examines teacher-candidates as they go through teacher-preparation programs and enter the profession. The idea, says Shulman, the foundation’s president, is to track how those candidates change in their understanding of subject-matter knowledge and teaching methods, and to see how that change translates into the way they teach once on the job. Researchers are now designing the project’s measurement tools, which will include paper-and-pencil tests as well as descriptions of teachers’ classroom practice.
Between 12 and 20 teacher-preparation programs could take part, says Shulman. “We see this as a prerequisite to doing any big study on teacher education,” he says.
In some ways, researchers’ focus on how teachers use their skills on the job brings teacher-quality research back to where it was 30 years ago. At that time, many experts had abandoned the idea that they could study the effectiveness of teachers without actually seeing them in action. That thinking led to a spate of large-scale, controlled studies—some of which the federal government paid for—in which scholars documented the practices and behaviors of teachers whose students showed impressive gains on standardized tests. In some studies, hundreds of schools were examined.
Those early projects suggested several important practices, such as: Good teachers feel a sense of responsibility for their students’ success, they minimize the time spent on disciplinary issues, and they seek to tailor their instruction to the particular abilities of individual students in their class.
Many experts believed another generation of large-scale projects was warranted as a way to clarify those attributes: to look, for example, for variations across different subjects. But instead, many of the studies that followed were much less ambitious, in part, some contend, because of the difficulty in securing funding. According to some studies, research support from the National Institute of Education (the precursor to the federal Office of Educational Research and Improvement, or OERI) fell by more than half between the early 1970s and mid-1980s.
Instead of designing large, controlled experiments, researchers often had to settle for observing small numbers of educators, or they had to mine existing federal and state databases in search of correlations between student achievement and various “proxies” for teacher quality. That shift in approach has added to the debate that rages today by allowing for the claim that few recent large-scale studies exist to support many assertions about teacher effectiveness.
“The ability of researchers to mount experimental designs went way, way down, so you have a different genre,” says Darling-Hammond, now a professor at Stanford University. “Normally, with a line of research, you start with a grosser level of detail, and then you fine-tune it. That’s what I think would have happened if the funding had continued.”
In one sign that the times are changing yet again, a team of academics is now in the midst of an unusually large project known as The Study of Instructional Improvement. The effort involves a detailed examination of how teaching practices change in schools that take part in so-called comprehensive school reforms. The idea behind such initiatives is that a school will improve if it adopts wholesale a new approach toward instruction, such as the increasingly popular early-literacy program known as Success for All.
In total, teachers in 120 schools, including a control group, will be studied. Research-team member Brian Rowan, an education professor at the University of Michigan, says the core question to be studied is, “What are the critical elements of instruction that appear to be affecting achievement?”
To answer that, project organizers have designed an elaborate system of assessments. Working with Rowan are fellow University of Michigan researchers David K. Cohen and Deborah L. Ball, all three with the Consortium for Policy Research in Education. Cohorts of students will be tested twice annually, and teachers will fill out five-minute checklists 90 times a year that are aimed at recording what they do in the classroom. The study’s main underwriter is the Atlantic Philanthropies, which also supports coverage of international issues in Education Week.
To develop the measurement tools, the team also has received a $4 million grant from the federal Interagency Education Research Institute, a cooperative effort that includes the Education Department, the National Science Foundation, and the National Institutes of Health. The overall study also is receiving funds from OERI. Rowan won’t say what the total budget is for the project, which will continue collecting data for three more years.
“It’s a lot of money,” he says. “This is a very large and extraordinarily intensive, and therefore costly, study.”
Will the return of such extensive research projects settle the debate over what makes for a good teacher? Few academics think so. But many suggest the aim is to refine the discussion, not to end it.
Arizona State University’s Glass, for one, believes the real point of continuing to pursue research on the topic is to help the public better understand the issue.
“What’s important,” he says, “is that the people whose interests are not so vested, or sunk in one side of the issue or the other, listen to the debate and make up their own minds.”
The Research section is underwritten by a grant from the Spencer Foundation.
A version of this article appeared in the April 03, 2002 edition of Education Week as Research: Focusing In on Teachers