What NAEP Can’t Tell Us About Charter School Effectiveness (Opinion)

Jonathon Christensen, Lawrence Angel

Jonathon Christensen is a research coordinator for the Center on Reinventing Public Education at the University of Washington’s Daniel J. Evans School of Public Affairs, in Seattle. Lawrence Angel is a research assistant at the center.

Within a day of the release of the highly anticipated 2005 National Assessment of Educational Progress data in October, contending parties in the charter school debate had released claims that the findings supported their points of view. Meanwhile, we have continued to study the data to determine what conclusions can be drawn correctly. As it turns out, the claims from both sides overstate the extent to which determinations can be made about the effectiveness of the charter schools included in the sample.

The NAEP tests are part of an extensive effort undertaken by the National Center for Education Statistics to document the performance of America’s schools. NAEP data contain test results for a national sample of public and private school students in grades 4, 8, and 12; state-level samples include only public school students in these grades. The 2005 NAEP report draws on an impressive data set, containing math scores for 343,000 students and reading scores for 336,000 students from more than 17,600 schools nationwide.

The data are useful as an indication of how students are currently performing overall, both nationally and in each state. But they are much less useful as a tool for comparing the effectiveness of one type of school to that of another.

In 2003, NAEP began reporting separate results for charter schools. With the release of the 2005 NAEP data, charter school performance can be viewed at two points in time. Fourth graders in charter schools were first assessed in 2003, and a new set of charter 4th graders was included in the 2005 data.

The usefulness of these data for comparing charters over time is limited, however, since entirely different groups of students were tested for the 2003 and 2005 reports. Moreover, the data are not helpful for comparing the effectiveness of charter schools with that of conventional public schools. Because information on family background is incomplete and individual students cannot be tracked over time, school effectiveness is confounded with the student-background characteristics that affect performance. The results are also presented as broadly aggregated averages, which likely conceal considerable variation within groups.

It is very difficult to distinguish the impact of a school from other factors influencing the performance of a student. The quality of the school a student attends can affect his or her test scores, but so can other factors, such as parents’ education level, family background, and the quality of schools attended earlier.

In a testing system like NAEP, where we know only a little bit about the family backgrounds of students and nothing about the schools they attended previously, it is hard to know what causes differences in schools’ test scores. If charter students perform at a higher or lower level than students in conventional public schools, the difference probably isn’t fully attributable to charter school attendance.

Assessing the relative effectiveness of charter and conventional public schools requires that researchers isolate the impact of school attendance from other influential factors. For 4th graders, comparisons of NAEP scores by group are strictly limited. Student-background factors, such as gender, race, family income, English-learner status, disability, or inner-city location, are controlled for only one at a time. But these factors are always found in combination, and charter schools and regular urban public schools serve many children who are educationally disadvantaged in multiple ways. Comparisons that look sensible might therefore be misleading.

It would be possible, for example, to compare charter and conventional public schools that serve the same proportions of African-American children. But the students in the two groups of schools could still be very different if, for example, charter school parents had much more (or much less) education than parents in the conventional public schools. Since NAEP data do not allow comparison of charter schools on more than one student attribute at a time, we cannot know whether the student populations of charter and conventional schools are very much alike or different.
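A minimal sketch of this point, using invented student records rather than NAEP data: the two hypothetical groups below match on each background attribute when the attributes are checked one at a time, yet differ sharply once the attributes are viewed in combination. The attribute names (`low_income`, `parent_no_college`) and all counts are assumptions chosen only for illustration.

```python
# Hypothetical student records, invented for illustration (not NAEP data).
# Each record is a pair: (low_income, parent_no_college).
charter = [(True, True)] * 50 + [(False, False)] * 50
conventional = [(True, False)] * 50 + [(False, True)] * 50

def share(students, index):
    """Fraction of students for whom the attribute at `index` is True."""
    return sum(1 for s in students if s[index]) / len(students)

def share_both(students):
    """Fraction of students for whom both attributes are True."""
    return sum(1 for s in students if s[0] and s[1]) / len(students)

for name, group in (("charter", charter), ("conventional", conventional)):
    print(
        name,
        "low income:", share(group, 0),
        "parent without college:", share(group, 1),
        "both:", share_both(group),
    )
```

Checked one attribute at a time, the two groups look identical: half of each is low income, and half of each has parents without a college education. Only the combined view shows that half of the hypothetical charter students face both disadvantages while none of the conventional students do, and that combined view is the one the published comparisons do not provide.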

NAEP data also make it impossible to distinguish different kinds of charter schools. This is important, because charter schools differ significantly from state to state, and even within particular localities. Reporting on charter schools’ performance as a whole conceals valuable information about the impact of various curricula, organizational structures, and accountability systems.

Perhaps the biggest potential pitfall of interpreting NAEP data is making an inference about an individual or school based on aggregate data for a group.

In examining the aggregate data for charter schools in Delaware, for example, one might discover that the average scale score for 4th graders tested in mathematics is 235. Of course, this is an average, and one cannot simply conclude that most charter school 4th graders in Delaware score around 235. Maybe all Delaware charter school students have about the same score. Maybe student scores range from under 100 to nearly 500, the top of the NAEP scale, with big differences between ethnic groups or between city and rural schools. The data available don’t let us determine which is true.
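The same idea in a short sketch, with two invented sets of five scores (not actual Delaware results): both average exactly 235, but one is tightly clustered and the other widely spread. The score lists are assumptions made up for this illustration.

```python
import statistics

# Hypothetical score lists, invented for illustration (not Delaware data).
clustered = [233, 234, 235, 236, 237]
spread_out = [150, 190, 235, 280, 320]

for label, scores in (("clustered", clustered), ("spread out", spread_out)):
    mean = statistics.mean(scores)
    # Population standard deviation: how far scores typically sit from the mean.
    deviation = statistics.pstdev(scores)
    print(f"{label}: mean={mean}, standard deviation={deviation:.1f}")
```

A reader given only the average of 235 has no way to tell these two situations apart.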

Presenting results on average, as NAEP is designed to do, conceals important variations among individual students and schools. At the same time, it might wrongly give the impression of causal relationships where none exist. For example, a reasonable interpretation of the recent NAEP results might be that “4th grade charter school students, on average, are earning slightly better NAEP mathematics scores than the charter students included in the test two years ago.” But the students and the schools included in this year’s NAEP are not the same as the ones included two years ago. So it is impossible to say that charter schools are teaching their students better than before, or that a definable group of students is learning more than in the past.
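A rough sketch of why a change in the average need not reflect a change in teaching: in the invented figures below, neither subgroup's average score moves between the two years, yet the overall average rises simply because the sample contains a different mix of students. The group labels, averages, and counts are all hypothetical.

```python
# Hypothetical subgroup averages and student counts, invented for illustration.
# Each entry maps a subgroup to (average score, number of students sampled).
# The subgroup averages are identical in both years; only the mix changes.
sample_2003 = {"group_a": (220, 60), "group_b": (250, 40)}
sample_2005 = {"group_a": (220, 40), "group_b": (250, 60)}

def overall_average(sample):
    """Average score across all sampled students, weighted by subgroup size."""
    total_points = sum(avg * n for avg, n in sample.values())
    total_students = sum(n for _, n in sample.values())
    return total_points / total_students

avg_2003 = overall_average(sample_2003)
avg_2005 = overall_average(sample_2005)
print("2003 overall:", avg_2003)
print("2005 overall:", avg_2005)
```

In this made-up case the overall average climbs from 232 to 238 even though no group of students scored any differently, which is why a rising average across two unrelated samples cannot by itself show that schools improved.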

No matter how hard analysts try, it is not possible to squeeze reliable conclusions about charter school effectiveness out of the NAEP data. Many serious studies are under way or about to be published, and though these too will be controversial, they can do a much better job of comparing like with like and separating school results from the effects of student characteristics.