“Accessible” or “universally designed” tests strive to measure content area knowledge without throwing unneeded obstacles in the way of students with disabilities or students who are mastering English.
But at the same time, the tests can’t be so simple that general education students can skate through without having to demonstrate their competence.
Walking that line between accessibility and test validity is a challenge. But researchers at Vanderbilt hope to provide help through a checklist of questions they developed for test creators to ask themselves.
By following the field-tested guidelines, the researchers say, test creators can eliminate common problems that make tests less accessible.
This question from a 7th grade math test shows how the principles of universal design can be used to modify questions for students with disabilities, or students learning English, without losing the meaning of the test items.
SOURCE: Vanderbilt University
For example, many multiple choice tests offer three wrong answers, or “distractors,” along with the correct answer. But having two distractors instead of three doesn’t seriously detract from the difficulty of the test, and is less likely to trip up students for reasons unrelated to their knowledge of the material, the researchers say.
Test creators also commonly use artistic elements, like pictures or cartoon images, in an attempt to enhance the look of the test. But students may think those stray illustrations hold some clue to the final answer. It’s better to use illustrations only when directly related to the answer, the checklist states.
“A lot of this stuff is somewhat intuitive,” said Peter A. Beddow, the senior author of the checklist, which has been named the Test Accessibility and Modification Inventory, or TAMI. “But when you’re taking these original items, you almost need permission to go as far as you need to go.”
Take, for instance, a question with a long poem that is designed to test a student’s vocabulary skills.
“Often, it really could be vastly shortened. The writing might not be as elegant, but the question is, does this item measure the construct it is supposed to measure?” said Mr. Beddow, who is a doctoral candidate at Vanderbilt University’s Peabody College of Education and Human Development in Nashville, Tenn.
Stephen N. Elliott, a professor of education at Peabody, and Ryan J. Kettler, a research assistant professor in the special education department, also worked on the development of the checklist. Their work was funded by a grant from the U.S. Department of Education as part of the Consortium for Alternate Assessment Validity and Experimental Studies.
The checklist is key, Mr. Elliott said. “People need a structure. This was a lot of inventions born out of necessity,” he said.
The testing requirements of the federal No Child Left Behind Act are a major driver of the need for accessible tests. The law’s regulations allow some students with disabilities to take different types of assessments than general education students. Two percent of all students, or about 20 percent of students with disabilities, can be counted as proficient when they take alternate assessments based on modified, but grade-level, academic standards. Those tests can have fewer questions and fewer choices in a multiple-choice section, and they can require a lower level of reading skill.
The Peabody test inventory, still in its early stages, can hopefully help test writers create assessments that meet those standards, Mr. Beddow said. The researchers developed their work by using universal design principles and “cognitive load theory,” which refers to how much information a person must hold in their mind to perform a task.
If a test isn’t intended to be assessing a student’s memory, then cognitive load should be reduced, the Peabody researchers suggest, even if that means cutting a long poem down to a few stanzas. “Extraneous cognitive load can throw [students] off,” Mr. Beddow said.
But to make these changes, test creators sometimes have to set familiar concepts aside. People tend to use three distractors on a test because that’s the way it’s always been done, Mr. Elliott said. Some college professors may even use four or five, including such trip-up answer choices as “none of the above” or “all of the above,” and they pass those habits on to teachers-in-training, who may use those same types of answer choices in their own classroom tests.
“Complicated does not necessarily translate to a better test,” Mr. Elliott said.
The checklist could be useful for training test item writers or state content teams, said Scott Marion, the associate director for the Dover, N.H.-based Center for Assessment, a nonprofit organization that works with states and districts to improve their testing-and-accountability systems.
However, shortening test items could be contrary to another theory of test creation, which suggests that students do well with questions that include lengthy real-world examples.
“There’s got to be a balance,” Mr. Marion said. He also questioned the researchers’ rating system for test items and the suggestion that all of the checklist items are of equal importance.
“If you’re trying to put numbers on things and treat them all as equal, I’m not sure that’s correct,” he said.
The Vanderbilt researchers have said that the inventory is in its early stages, and that they are seeking feedback from practitioners to make the checklist better.
The work at Vanderbilt is part of an active research effort surrounding the creation of accessible tests.
Jamal Abedi, a professor of education at the University of California, Davis, and a researcher with the Los Angeles-based National Center for Research on Evaluation, Standards, and Student Testing, is part of a team developing accessible tests for reading comprehension.
Tests of reading comprehension offer a specific challenge, Mr. Abedi said, because changing the language of a text passage—for instance, using simpler vocabulary words—may actually change what the test is supposed to measure.
The researchers have found some changes that help all students, without the concern of making the test inappropriately easy, Mr. Abedi said. For example, Mr. Abedi and his colleagues have experimented with sprinkling questions throughout a long text passage, rather than leaving all of the questions at the end.
Using that method, students only need to read a few paragraphs at a time before answering questions related to that section. The only questions that would be left at the end of the passage are summary-type test items, Mr. Abedi said.
“This approach increased reliability without affecting comprehension,” said Mr. Abedi.
Other parts of the research group that Mr. Abedi is working with are exploring such options as allowing students to select the passage they want to read out of several choices. The rationale is that students will be less frustrated or distracted if they’re reading material that interests them.
Mr. Abedi said the goal of the team’s research is to develop tests that are appropriate for all students, not just those with disabilities or English language learners.
Though the work at Peabody has its genesis in the special education field, Mr. Beddow hopes that the checklist could be used to create better tests for all students.
“My hope is, as a field, we’re shifting our perspective in writing tests,” he said.
A version of this article appeared in the October 22, 2008 edition of Education Week as Researchers Piloting ‘Accessible’ Guidelines