Assessment Experts Fret Over Limitations Of High-Stakes Testing
Many of the nation’s top testing experts gathered here recently to ponder the question: “Benchmarks for Accountability: Are We There Yet?” The experts say no, even though they know policymakers want them to say yes.
Researchers spent most of the two-day conference explaining the technical limitations of test-based accountability. But they need to stop being “academic naysayers,” Lorraine M. McDonnell, a political science professor at the University of California, Santa Barbara, told more than 300 researchers, state testing directors, and local officials at the annual conference of the National Center for Research on Evaluation, Standards, and Student Testing, or CRESST.
Instead, the experts should help state leaders design the best possible systems, she said.
Another prominent researcher added that politicians are so committed to the movement for educational accountability that they will charge ahead despite any warnings or opposition from experts.
“This movement is rooted in state politics,” said Richard F. Elmore, a professor at the Harvard University graduate school of education. “It is extremely resilient and durable. It ain’t going anywhere but straight ahead.”
Still, at several points, researchers at the Sept. 16-17 gathering on the campus of the University of California, Los Angeles, expressed doubts about the accountability systems that depend on rewards and sanctions.
“What we see in Kentucky is behaviors geared toward changing scores rather than behaviors geared toward changing what students do,” said Brian Stecher, a senior social scientist for the RAND Corp., a Santa Monica, Calif.-based think tank.
Accountability systems can succeed without manipulating educators through rewards and penalties, as Kentucky and many other states do, countered Lauren B. Resnick, a professor at the University of Pittsburgh.
Ms. Resnick found in her work with the Pittsburgh schools that simply requiring principals to embrace new policies by sending teachers to professional-development programs yielded test-score increases.
“The accountability that might change Pittsburgh’s performance,” she said, “is probably something as simple as supervisors of each principal saying: ‘I want you to tell me how many teachers are participating in professional development and how many are using the materials. For any who aren’t doing those things, we need an explanation why.’ ”
In California, two CRESST researchers are taking the advice to get involved in designing the state’s new accountability system, even if they have doubts about it. Eva L. Baker, a co-director of the federally subsidized center and a professor at UCLA, where CRESST is headquartered, and Edward H. Haertel, a Stanford University professor, are co-chairing the team that is designing the system.
The team is preparing a system to reward schools that improve student performance while narrowing the gap between low-achieving and high-achieving schools.
The group plans to propose to the state board of education an index that would set goals for all schools to reach. But the state’s lowest-performing schools would be asked to raise test scores at a faster pace.
If schools hit their targets, it would mean that the lowest scorers had closed in on their peers who had previously outscored them, William L. Padia, the director of the office of policy and evaluation for the California education department, told the conference-goers.
The system’s designers said that even if the overall system is well-crafted, its success would depend on the state’s addition of achievement measures other than the scores from the Stanford Achievement Test-9th Edition--the standardized test given annually to every 2nd to 11th grader in California.
The Stanford-9 is not directly tied to state standards, and so is more likely to reflect students’ demographic backgrounds than what they’ve learned in schools, Ms. Baker told one session.
The ultimate success of such an accountability system would rely on the creation of a valid test to supplement the Stanford-9 results, she said. The state gave such a test last spring, but its results haven’t been validated for use in the accountability system.
--David J. Hoff