Special Report
Federal

Stimulus Seeks Enriched Tests

By Stephen Sawchuk, August 11, 2009

No matter where teachers, state officials, and testing experts stand on the debate about school accountability, they generally agree that the United States’ current multiple-choice-dominated K-12 tests are, to use language borrowed from the No Child Left Behind Act, in need of improvement.

Now, federal officials are signaling that they expect the caliber of testing to change.

U.S. Secretary of Education Arne Duncan recently announced that he will set aside $350 million of the $4.35 billion in discretionary aid in the Race to the Top Fund to improve assessments.

Testing experts say that money could serve as a down payment for scaling up tests that would better measure students’ critical-thinking skills and improve teacher and student engagement in the assessment process. The catch, they warn, is that truly achieving that goal may force federal officials to rethink the current parameters around assessment and accountability in the NCLB law.

“Accountability testing is seen as a necessary evil to be minimized. It’s like going to the dentist. You have to do it, but it hurts,” said Randy E. Bennett, a distinguished presidential scholar at the Educational Testing Service, a nonprofit testing and research organization based in Princeton, N.J.

“The goal,” he said, “should be to make even the test as much of a learning experience as possible, so the student actually benefits from taking it, and teachers are given some important information for the purposes of instruction.”

Measuring Performance

The image of fill-in-the-bubble items has become all but inseparable from the NCLB law, which requires annual testing in grades 3-8 and once in high school.

Multiple-choice items can efficiently discover whether a student has assembled discrete pieces of knowledge across a subject. The results are typically highly reliable, meaning the error associated with the results is low—a desirable quality for high-stakes tests. And they are easy and cheap to score.

Such tests, though, are not ideal for identifying whether students can take multiple pieces of domain-specific knowledge and analyze, integrate, and apply them in unfamiliar contexts, Mr. Bennett said. And researchers familiar with international benchmarking argue that those critical-thinking skills are precisely the type that will be in demand as the global economy becomes increasingly knowledge-oriented.

“I think the tragedy is that things that are easy to test and teach lose relevance,” said Andreas Schleicher, the head of the indicators and analysis division for the Paris-based Organization for Economic Cooperation and Development. The OECD sponsors the Program for International Student Assessment, or PISA, which includes performance-based items.

“The feature that is central to PISA is that we’re not that interested in whether students can reproduce content knowledge,” Mr. Schleicher said, “but whether they can extrapolate what they know and apply it in novel situations.”

Sample Assessment

The proposed application requirements for the Race to the Top Fund define a “high quality” assessment as one that uses “a variety of item types, formats, and assessment conditions,” including performance-based tasks, to measure student achievement.

College- and Work-Readiness Assessment:

You advise Pat Williams, the president of DynaTech, a company that makes precision electronic instruments and navigational equipment. Sally Evans, a member of DynaTech’s sales force, recommended that DynaTech buy a small private plane (a SwiftAir 235) that she and other members of the sales force could use to visit customers. Pat was about to approve the purchase when there was an accident involving a SwiftAir 235. Your document library contains the following materials:

• Newspaper article about the accident

• Federal Accident Report on in-flight breakups in single-engine planes

• Internal correspondence (Pat’s e-mail to you and Sally’s e-mail to Pat)

• Charts relating to SwiftAir’s performance characteristics

• Excerpt from magazine article comparing SwiftAir 235 with similar planes

• Pictures and descriptions of SwiftAir Models 180 and 235

Sample Questions:
Do the available data tend to support or refute the claim that the type of wing on the SwiftAir 235 leads to more in-flight breakups? What is the basis for your conclusion? What other factors might have contributed to the accident and should be taken into account? What is your preliminary recommendation about whether or not DynaTech should buy the plane and what is the basis for this recommendation?

SOURCE: Council for Aid to Education

Performance-based tests designed to measure those abilities are common in specialized fields such as medicine, where examinees must diagnose and treat simulated patients. But the exams typically require scoring by humans, and for that reason are costlier than those that use exclusively multiple-choice questions. They also produce results that paint a deeper picture of students’ understanding but are less mathematically reliable than multiple-choice tests.

Issues of both cost and reliability, testing experts say, explain why extended performance-based tasks have not penetrated K-12 assessment under the NCLB law.

What now seems to be an intractable choice between richer tasks and reliable data, though, could be mediated by advancements in technology that could widen access to performance-based testing while lowering its cost and improving its reliability, some experts argue.

And the federal funding, they say, could be the lever to support that work.

“It’s expensive to put [new item formats] into practice, and to the extent that infusion can help create not only prototypes of promising assessment but support some of the infrastructure needed to deliver them efficiently, [it] will be an important legacy,” Mr. Bennett said.

Federal officials have not yet revealed the details on the funding, which will be awarded to states as part of the Race to the Top Fund. But Secretary Duncan has intimated in public appearances that the funding will support assessments aligned to the common core of standards now in development.

Some standardized performance-based examples already exist, such as the College and Work Readiness Assessment, a computer-based test that is given primarily to high school freshmen and seniors in private schools.

The exam, run by the Council for Aid to Education, a New York City-based nonprofit group that works to improve access to higher education, includes a task that requires students to sift through various texts and sources of data and draw conclusions from them to support an argument.

“By and large, the real world doesn’t present itself as nice little abstract tasks with four options that you choose from,” said Richard J. Shavelson, a professor of education at Stanford University who helped design the assessment.

The high costs of scoring a complicated assessment with an almost unlimited number of answers, he added, could be mitigated by advancements in natural-language-processing software—essentially programming that proponents claim can judge written essays as accurately as human readers and reduce, though not eliminate, the need for costly human evaluation.

In addition, experts say, technology offers the ability to measure student understanding of concepts and processes involving critical thinking that have been notoriously difficult to assess using only multiple-choice items.

For the 2009 National Assessment of Educational Progress in science, officials assessed a subset of students using “interactive computer tasks.” Those items require students to engage in the entire process of scientific inquiry, in which they must participate in a simulated experiment, record data, and defend or critique a hypothesis.

One of the benefits of the computer-based tasks, said Mary Crovo, the deputy staff director of the National Assessment Governing Board, which sets policy for NAEP, is that computers can simulate tools that would be dangerous or impractical to replicate in an assessment context, or processes such as evolution that occur over long expanses of time.

Improving Instruction

Experts add that the infusion of federal cash could also provide more opportunities to devise tests that will better engage teachers in the cognitive science about how knowledge develops.

“We know that it’s not only the amount of knowledge that’s important, but the way it’s organized, and we don’t test knowledge organization at all, at least not directly,” Mr. Bennett said.

One potential prototype for such a system is the ETS’ Cognitively Based Assessment of, for, and as Learning. The reading, writing, and mathematics tests incorporate the knowledge and skills that students must master to succeed in more-complex tasks.

An assessment on fictional reading might ask students to diagram the various structures of the plot, such as the conflict, rising action, and conclusion, before moving on to an analytical open-ended question.

The ETS assessment also will include subunits that teachers can use in a non-high-stakes setting to help students home in on prerequisite content and skills.

In Portland, Maine, where the ETS has developed and field-tested the cognitive-based system in collaboration with teachers in three middle schools, officials praised the level of teacher involvement in its design.

“The landmark piece of this whole project is how much teachers have helped design these assessments,” said Tom Lafavore, the district’s director of educational planning. “We are breaking down the bigger skills into smaller ones that we can check along the way.”

Still, assessment experts express some wariness about the new federal funding, saying it might not improve test design unless U.S. officials also consider the context in which such new assessments might be used.

If measures of higher-order, critical-thinking skills are to be part of an accountability system, for instance, federal officials will probably need to reconsider aspects of the No Child Left Behind law, they said.

“If I told you to develop a much more energy-efficient car but you can’t change the materials, the engine, and the fuel it uses, you’re not going to get very far,” said Bill Tucker, the chief operating officer of Education Sector, a Washington-based think tank that has issued reports on advanced testing techniques.

Psychometricians point in particular to the constraints on testing placed by the federal law, which requires 95 percent of all students in each grade and each ethnic subgroup to be assessed. For efficiency, cost, and security reasons, each state typically conducts all its testing on the same day, in a narrow time frame.

“I think one thing that’s got to give is the idea of a short test,” said Mr. Bennett of the ETS. “You can’t cover a domain broadly, or enough of a domain deeply, if you give a short test, and you can’t give back information that’s going to be valuable to the teacher or student in terms of what to do.”

It might be possible to administer assessments in parts over the course of the year and to aggregate the results, rather than simply create longer tests, he suggested.

Another possible solution, experts say, would be to move to a system that samples student performance, rather than giving every student the same test form. Each student would take only a part of the exam, with results aggregated at a higher level.

But such a system has not been used for school accountability purposes, and would contravene the NCLB requirements that all students in a state take the same test. It would also complicate efforts to break out schools’ test-score results by racial or ethnic and income-level categories, among other areas.

“It’s a question of what your purpose is,” said Brian Stecher, the associate director of education at the RAND Corp., a Santa Monica, Calif.,-based research and analysis group. “If you’re monitoring how well the system is performing, you don’t need a score on every kid. I think there is a way to strike a better balance.”

Ultimately, experts say, the federal agenda for the funding will probably determine its utility.

“Unless they’re very clear about the uses—accountability, instruction, evaluation—it’s very easy for this to get corrupted,” said Scott Marion, the associate director of the Dover, N.H.-based Center for Assessment, a test-consulting group. “I think you can easily waste this money if you’re not really careful about it.”

A version of this article appeared in the August 12, 2009 edition of Education Week as Stimulus Seeks Enriched Tests
