Study Relates Cautionary Tale Of Misusing Data

Under federal law, states have to break out test results by race, ethnicity, and other student subgroups. But two Texas researchers caution that such data could be misused if educators aren’t better prepared to interpret the information.

Just such a situation, they say in a new paper, occurred at a school where the researchers were working. For five years, a research team from the University of Texas at Austin partnered with an unnamed, high-poverty urban high school to improve mathematics instruction.

At the end of that time, write researchers Jere Confrey and Katie Makar, the school had posted a 25 percent increase in the passing rate on the math portion of the state exit test, the Texas Assessment of Academic Skills.

But when the 2000 TAAS results came out, the passing rate in math for African-American students in the school dropped below the 50 percent that was deemed acceptable by the state. (The proportion that passed was 48.4 percent.) As a result, the school was labeled “low performing.”

The “No Child Left Behind Act” of 2001, which was modeled in part on the Texas accountability system, requires all public schools to disaggregate test-score data.

According to the authors of the Texas study, teachers were admonished to focus their instruction on raising the achievement of black students. The school suggested adjusting its improvement plan so that all African-American 9th and 10th graders would be assigned peer tutors. It also called for meeting with teachers from a neighboring high school that had gotten off the low-performing list by requiring all its black students to attend lunchtime tutoring.

At a meeting with district administrators to discuss the test results, the authors write, there was virtually no discussion of the 25.5 percent of Hispanic students or the 14.3 percent of white students who also failed the test.

The researchers expressed concern to school administrators that the proposals would unfairly target African-American students for specific interventions regardless of how they had performed. Shortly thereafter, they say, their partnership with the school was abruptly terminated.

Natural Variation

In the paper, to be published later this year by Harvard University’s graduate school of education, the researchers note that in planning its improvement strategies, the school failed to look at the performance of African-American students over time or at the distribution of test scores.

Had school officials done so, conclude Ms. Confrey and Ms. Makar, whose paper was presented at a conference at Harvard this spring, they would have seen that the passing rate for black students had been on an upward trajectory for the past five years and generally mirrored improvements at the state and district levels.

The one-year drop in the passing rate for black students, based on a test group of only 31 African-American youngsters, still fell within one “standard deviation” of their projected passing rate, close enough so that the drop could have been caused by measurement or sampling errors, the paper says.

If one student had answered just one or two more questions correctly, the authors estimated, the school would not have earned the low-performing label.
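The arithmetic behind that estimate can be sketched in a few lines. The count of 15 passing students is an assumption, inferred because 15 of 31 is the figure that rounds to the reported 48.4 percent. The binomial standard error used here is a common textbook approximation of sampling variation, not necessarily the method the authors used in the paper.

```python
import math

N_STUDENTS = 31       # size of the test group reported in the article
THRESHOLD = 0.50      # passing rate the state deemed acceptable
PASSED_ASSUMED = 15   # assumption: 15/31 rounds to the reported 48.4 percent


def passing_rate(passed: int, total: int) -> float:
    """Share of the group that passed, as a fraction between 0 and 1."""
    return passed / total


def binomial_standard_error(rate: float, total: int) -> float:
    """Textbook standard error of a proportion for a group of this size."""
    return math.sqrt(rate * (1 - rate) / total)


reported = passing_rate(PASSED_ASSUMED, N_STUDENTS)
one_more = passing_rate(PASSED_ASSUMED + 1, N_STUDENTS)
std_err = binomial_standard_error(reported, N_STUDENTS)

print(f"Rate with {PASSED_ASSUMED} passing: {reported:.1%}")
print(f"Rate with one more passing student: {one_more:.1%}")
print(f"Approximate standard error: {std_err:.1%}")
print(f"Clears the threshold with one more: {one_more >= THRESHOLD}")
```

Under these assumptions, a single additional passing student moves the group from about 48.4 percent to about 51.6 percent, and the standard error for a group of 31 is roughly 9 percentage points, several times larger than the shortfall that triggered the label.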

Moreover, when Ms. Confrey and Ms. Makar compared the distribution of African-American students’ scores with those for nonblack 10th graders, they found a substantial overlap, with individual black students scoring across the spectrum.

“While we certainly recognize that a passing rate of only half the students is cause for concern and significant efforts at intervention,” they caution in the paper, “we believe that the policy should be based on better-informed ideas of variation.”

‘Bubble Kids’

Noting that many schools try to improve their performance by focusing on “bubble kids,” or those whose scores fall just below passing, the researchers undertook two analyses.

First, they examined whether the state was closing the achievement gaps between racial and ethnic groups in 10th grade math over time, looking first at the percent of students in each group passing the exam and then at the average math score for each subgroup. When the latter method was used, any closing of such gaps became much less apparent.

Next, the authors examined the math-test scores of 7th graders at a Texas middle school compared with the same students’ scores the previous year. Surprisingly, of students who failed but were close to passing in grade 7, more than one-third had been high performers in 6th grade.

“This variation in the performance of particular students raised significant questions about what is being measured for these students each year,” write Ms. Confrey and Ms. Makar.

When the pair plotted the performance of 300 students in grades 8-10, chosen randomly across Texas, again they found that quite a few students performed differently on TAAS across the grades. While many students lost ground over the two-year period, a significant number performed substantially better.

“One of the concerns with just having a single measure is we sometimes see schools then acting on that measure to try and bring their scores up, rather than thinking about what the needs of the students are,” said Ms. Makar, a doctoral candidate in mathematics education.

Based on their analyses, the University of Texas researchers advise schools to consider statistics besides the percent passing in any subgroup before designing instructional strategies. Those include sample size; test performance over time; mean and median scores; and the distribution of scores by performance quartiles, because students in the lowest quartile often get ignored if their scores fall well below the passing standard.
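As a rough illustration of what looking beyond the percent passing might involve, the sketch below computes several of the statistics the researchers list for a single subgroup. The function name, the scores, and the passing cutoff of 70 are all hypothetical, invented for illustration only; they are not data from the study. Performance over time is not covered here.

```python
import statistics


def summarize_subgroup(scores, passing_score):
    """Summarize one subgroup's scores beyond the percent passing."""
    n = len(scores)
    passed = sum(1 for s in scores if s >= passing_score)
    # Cut points that divide the scores into four performance quartiles
    q1, q2, q3 = statistics.quantiles(scores, n=4)
    return {
        "sample_size": n,
        "percent_passing": 100 * passed / n,
        "mean": statistics.mean(scores),
        "median": statistics.median(scores),
        "quartile_cutoffs": (q1, q2, q3),
    }


# Hypothetical scores and cutoff, invented for illustration only
example_scores = [52, 58, 61, 66, 68, 69, 71, 74, 80, 88]
summary = summarize_subgroup(example_scores, passing_score=70)

for name, value in summary.items():
    print(f"{name}: {value}")
```

In this made-up group, only 40 percent pass, yet the mean and median sit just under the cutoff, and the lowest quartile falls well below it. That is the kind of detail a single passing rate hides.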

“It’s important not to simply react to ‘fix’ the problem and get out of the situation,” cautioned Ms. Confrey, a professor of mathematics education. “One has to interpret the data in light of the activities at the school.”

Lynn Olson

Lynn Olson was managing editor of special projects for Education Week. She also covered national policy (including “P-16” issues, NCLB standards, accountability, and reform), assessment and testing.