Education Scholars Finding New 'Value' In Student Test Data

Every year, school administrators at E.A. Cox Middle School in Maury County, Tenn., look for patterns in the state test scores of students assigned to individual teachers. Then they coax and cajole teachers who have been especially effective at raising achievement to teach subjects or grades the school is having trouble with.

“It’s kind of been a joke here at Cox that you don’t know what you’ll be teaching until school starts,” says Principal Debbie Steen. “The teachers are just very flexible, and they don’t get upset about that. The teachers are very dedicated to finding the best fit for the child.”

That approach seems to be paying off. In 2002, the 700-student school posted larger mathematics gains, averaged over three years, than any other middle school in the state with similar percentages of poor and minority students. E.A. Cox’s reading gains topped those of all but one middle school with a comparable student population.

Soon, principals and teachers in Colorado, Ohio, and Pennsylvania will be armed with similar data, as part of pilot projects that rely on “value added” methods. Rather than simply rank schools on raw test scores, such analyses focus on the progress by individual students over time.

Some contend that such information provides a fairer way to judge schools, based on how much schools “add value” to a student’s knowledge and skills. The data also can help pinpoint a school’s strengths and weaknesses, right down to the improvements in individual teachers’ classrooms.

“Right now, a lot of states and districts rely on changes in cohorts,” explained Laura S. Hamilton, a behavioral scientist with the RAND Corp., a research institute in Santa Monica, Calif. “They’ll compare the performance of this year’s 4th graders with last year’s 4th graders. That confounds differences in the particular abilities and other characteristics of the students in each group with real changes in test performance.”

Value-added methods, she said, have “the potential to provide more accurate estimates of changes in test scores than we currently get with a lot of the systems being used. It also, I think, does a better job of communicating what we really want to know, which is the extent to which individual students are gaining or losing in terms of test scores.”
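The distinction Ms. Hamilton draws can be made concrete with a small sketch. The scores, student IDs, and function names below are invented for illustration and are not taken from any state's system or from Mr. Sanders' model, which is far more elaborate; the sketch only contrasts a cohort-to-cohort comparison with the average gain of the same students followed from one year to the next.

```python
from statistics import mean

# Hypothetical scale scores keyed by student ID (illustrative data only).
last_year_grade3 = {"s1": 410, "s2": 430, "s3": 450}  # this year's 4th graders, one year ago
this_year_grade4 = {"s1": 440, "s2": 455, "s3": 485}  # the same students, now in 4th grade
last_year_grade4 = {"t1": 470, "t2": 480, "t3": 490}  # a different group of students


def cohort_change(previous_cohort, current_cohort):
    """Difference in mean score between two different groups in the same grade."""
    return mean(current_cohort.values()) - mean(previous_cohort.values())


def mean_individual_gain(before, after):
    """Average year-over-year gain for students who have scores in both years."""
    shared = before.keys() & after.keys()
    return mean(after[s] - before[s] for s in shared)


cohort_result = cohort_change(last_year_grade4, this_year_grade4)
gain_result = mean_individual_gain(last_year_grade3, this_year_grade4)
print("Cohort-to-cohort change:", cohort_result)
print("Mean individual gain:", gain_result)
```

With these made-up numbers, the cohort comparison shows a 20-point drop, while the same students gained 30 points on average. The apparent decline comes entirely from last year's 4th graders having started from a higher level.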

Pilot Projects

Probably the best-known proponent of value-added analyses is William L. Sanders, who directs the value-added assessment and research center for SAS inSchool, part of the Cary, N.C.-based SAS Institute, a software company. Since the early 1990s, Tennessee has used Mr. Sanders’ value-added techniques to show the public the gains by every school and district, averaged over three years. (“Sanders 101,” May 5, 1999.)

Working with the SAS Institute, the statistician is now exporting his methodology to other willing states and districts. To date, districts in 21 states have signed on, including three large-scale pilots in Colorado, Ohio, and Pennsylvania. Specifically:

In Pennsylvania, about 30 districts are participating in a $500,000 project this school year with SAS inSchool. To participate, districts had to test students every year in grades 3-8 and link test results with individual students. In September, the state board of education approved a plan that calls for every district to include a value-added component in its assessment system by 2005-06. “We’ve always understood that the fairest measure for our schools, for our school districts, is looking at where did your kids start, and where did they end up?” said Charles Zogby, the commissioner of education in Pennsylvania.

In Ohio, 42 districts are participating in a project coordinated by Battelle for Kids, a nonprofit group based in Columbus, that will generate value-added data for each school, again working with SAS inSchool.

In Colorado, about 40 districts have worked with Mr. Sanders over the past four years to produce value-added analyses for their schools, as part of a voluntary project started by the state department of education. Last year, legislators approved a separate “academic growth pilot program” that will track the progress of individual students in reading, writing, and math in participating schools and districts, beginning this school year. Starting with the 2005-06 school year, the law calls for every district in the state to take part in the “academic growth program.”

State Rep. Keith C. King, a Republican who sponsored the legislation, said he wanted to “give teachers tools whereby they could actually improve each student’s academic achievement.”

Under the state’s 1999 accreditation rules, schools and districts must demonstrate that all student subgroups, such as Hispanic students or students with disabilities, have achieved at least one year’s academic growth in a year’s time.

Gaining Momentum

Other states are using value-added methods for accountability.

Florida assigns letter grades to schools based partly on the learning gains of individual students. Jim Horne, the secretary of education, argued, “By being able to measure annual learning gains, that is a powerful place to be.”

One of the most powerful and controversial aspects of Mr. Sanders’ system is that it can reach beyond the school level to produce a measure of an individual teacher’s effectiveness, based on how the students in his or her classroom progress each year. In Tennessee, such information is shown only to school officials, who can use it as part of job evaluations, and to the teachers themselves.

Kip Reel, the superintendent of the 11,350-student Maury County, Tenn., schools, said his system uses the information to set performance goals for principals and to help evaluate individual teachers.

“I don’t deny that you can get a feel for how effective an individual educator is by visiting the classroom, or by observations or things like that,” he said. “But it’s a real big benefit to have some quantification that allows comparisons within the school or within the system.”

Starting this fall, parents in Tennessee also can go on the department of education’s Web site and find confidential projections for whether their children will pass the state tests required for high school graduation, as indicated by their performance to date. The site also projects the youngsters’ chances of earning a high enough score on the ACT college-entrance exam to gain admission to one of the state’s colleges or universities, earn an A or B average as a college freshman, or major in a technical field, such as computers or engineering.

‘Seize This Opportunity’

Until now, many states have faced two big challenges in doing such value-added analysis.

The first is that they did not test every student annually and so could not chart individuals’ progress from year to year. That situation will change under the federal “No Child Left Behind” Act of 2001, which requires states to test every student in grades 3-8 annually in reading and mathematics, no later than the 2005-06 school year.

“My message to the states is you need to seize this opportunity,” said Mr. Sanders of SAS inSchool, “and begin to follow the academic progress of individual kids in much more robust ways.”

The second impediment is that states must have some way to link the records of individual students over time, usually by assigning each student a unique identification number. To date, only 16 states have that capacity, although the number of such states is growing.
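To show why a unique identification number matters, here is a minimal sketch of linking one student's results across years. The record layout, IDs, and function names are hypothetical and do not describe any state's actual data system.

```python
from collections import defaultdict


def link_records(yearly_results):
    """Group (year, score) pairs by student ID across several years of results."""
    histories = defaultdict(list)
    for year, records in yearly_results.items():
        for student_id, score in records:
            histories[student_id].append((year, score))
    # Sort each student's history chronologically.
    return {sid: sorted(history) for sid, history in histories.items()}


def trackable(histories, min_years=2):
    """Keep only students with enough tested years to compute a gain."""
    return {sid: h for sid, h in histories.items() if len(h) >= min_years}


# Illustrative data only: each record is (student ID, scale score).
yearly_results = {
    2001: [("A17", 410), ("B22", 395)],
    2002: [("A17", 440), ("C03", 420)],
}
linked = link_records(yearly_results)
print(trackable(linked))
```

In this example only student A17 appears in both years, so only that student's progress can be charted. Students who lack a consistent ID, or who are tested in only one year, drop out of any gain calculation.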

In September, Gov. Gray Davis of California signed legislation to set up a data system that will track the individual progress of students. SB 1453, sponsored by Democratic state Sen. Dee Dee Alpert, authorizes the use of up to $6 million in federal aid to develop the longitudinal data system.

“If we hadn’t been able to access federal money, I don’t know if I would have gotten a signature from the governor on this,” Sen. Alpert said. “This is not sexy, in the sense that people want to say they made classes smaller or gave a book to every kid, so it winds up being way down at the bottom of the list.

“Yet without it,” she continued, “you don’t actually know if the things that you’re doing programmatically are doing any good, because you don’t have any good information.”

The National Center for Educational Accountability, based at the University of Texas at Austin, is urging states to devise such longitudinal, student-level data systems. The center currently is using data from seven states to investigate what high-performing schools do that average- and low-performing schools do not.

Challenging Work

Creating data systems that can track individual students is not easy, though.

One problem is technical. Arizona, for instance, is trying to set up the Student Accountability Information System, a new state database that will trace students’ progress. As of September, nearly 20 percent of the state’s 836,000 K-12 students weren’t enrolled in the database, in part because many districts needed to buy or upgrade software. The state is now threatening to withhold per-pupil funding from districts and charter schools that do not report the required information.

Chuck Essigs, an adviser to the state superintendent of public instruction, said Arizona districts range in size from three students to more than 70,000, and the state has hundreds of charter schools. “There’s a lot of work that has to be done on the district level to prepare their data to be properly submitted to the state, so the state can process it,” he said.

On the political side, collecting information on individual students raises privacy concerns. And some educators oppose linking student test scores to the evaluation of individual teachers.

“In some ways, those political hurdles may be more insurmountable than the technical ones,” said Ms. Hamilton of RAND.

Finally, researchers use various methods to conduct value-added analyses, and it’s not clear which approaches are best or how to improve them. Many techniques are complex and aren’t readily understood by teachers or parents.

In the summer 2002 issue of Education Next, Dale Ballou, an associate professor of economics at the University of Massachusetts at Amherst, argued, “It is much harder to measure achievement gains than is commonly supposed.” He cautioned that while value-added techniques provide a useful diagnostic tool, too many uncertainties surround such methods to use them for high-stakes personnel decisions.

‘Tougher Than We’d Like’

Among other problems, he noted, current methods of testing don’t measure gains very accurately. When statistical methods are used to minimize error or “noise,” the systems quickly become incomprehensible to educators, losing the “transparency” that many argue is a hallmark of effective accountability systems.

But Harold Doran, the director of research and evaluation for the education-performance network at New American Schools, a nonprofit group based in Alexandria, Va., said, “Adopting a less rigorous model to gain increased transparency is inexcusable.”

“I wouldn’t expect a physician to dumb down their surgical procedures simply because it’s easier for a patient to understand,” he said, “but that model should be transparent to other physicians.”

New American Schools is currently conducting a study of the different value-added models now in existence. Among other tasks, the project is analyzing data from an unnamed state to see whether the different methods produce similar or different results.

RAND also is conducting a study, underwritten by the Carnegie Corporation of New York, of the various value-added models; the challenges involved in applying them to education and, particularly, to the performance of individual teachers; and how those challenges might be overcome.

“Our stance is that this is a very promising thing to look at. We just need a lot more work to know how to do it best,” said Daniel M. Koretz, a senior social scientist at RAND and a professor of education at Harvard University. “It’s just a lot tougher than we’d like.”

Lynn Olson

Lynn Olson was managing editor of special projects for Education Week. She also covered national policy (including “P-16 issues,” NCLB standards, accountability, and reform), assessment and testing.

A version of this article appeared in the November 20, 2002 edition of Education Week as Education Scholars Finding New ‘Value’ In Student Test Data