In Search of Better Assessments
By their nature, state testing systems are often wider than they are deep, shaped by compromises between a variety of limiting factors, notably cost and time.
Even so, experts say, there are several steps states can take to make their assessments more useful instructionally and less prone to corruption.
First, states can clearly delineate the range of subject matter that a test is supposed to measure, then devise questions that cover that entire domain. Rich, comprehensive coverage encourages teachers and students to pay more attention to the entire curriculum and not just to what is being tested.
Most state tests measure very small samples of content and skills from very large domains--such as all of the mathematics learned in grades 5-8. And the content standards themselves are often vague and poorly defined.
Daniel Koretz, a professor of education at Boston College and a senior social scientist with the Santa Monica, Calif.-based RAND Corp., says states might consider making the domain much narrower and the assessment much broader. For example, he says, instead of testing the whole range of high school math, test students in algebra with an end-of-course exam. Then, Koretz adds, "don't test algebra with 10 items. Test it with maybe a bank of 250 items, not all of which you'd use every year."
Both North Carolina and Virginia are phasing in statewide end-of-course exams that high school students will have to pass to earn a diploma. New York state has traditionally given such tests, known as the regents' exams, to its college-bound students.
The state is now revising those tests by tying them to the state's new academic standards and requiring all students to take them. Maryland also is developing end-of-course exams that high school students would have to pass to graduate.
But it's much more expensive to produce many of those exams than a single survey test. Many states also are reluctant to spell out their expectations for students in too much detail because of concerns that prescriptions would result in a centralized curriculum.
Traditionally, Americans have worried that such a curriculum would usurp local control over schools and prevent educators from meeting the individual needs of students.
"I think the price of a centralized, test-based accountability system is you have to have a central curriculum," says Koretz, who notes that there's nothing in the research to suggest whether such a curriculum would be good or bad.
There are also steps states can take to make sure their assessment systems are connected with classroom work.
For example, they can include teachers or instructional specialists on test-development committees. They can have teachers mark student papers and grade assessments as a way to improve their instructional knowledge. They can train teachers in how to interpret test scores.
States also can collect data on school practices--such as course-taking patterns or homework assignments--so that they can begin to draw connections between test results and instruction. The National Assessment of Educational Progress regularly surveys students, their teachers, and their principals about the practices in their schools as a way to help understand test scores.
In Maine, more than 250 teachers participated in the creation of new state assessments that will be field-tested this year. And the state plans to train teachers to score the responses.
"It's helped give a real instructional focus to the work we're doing," says Horace "Brud" Maxcy, the coordinator of the Maine Educational Assessment Program.
The state also plans to make students' scored work available to their teachers electronically, at least for those test items that all students take in common.
In addition, states such as Maine and Oregon are trying to ensure greater coherence between the state tests and those given by individual schools and districts. In Oregon, students' ability to meet state standards will be measured through a combination of state and local exams.
Maine officials are trying to make sure that standards-based testing is not a once-a-year event by training teachers and administrators to design or purchase assessments for local use that also reflect the standards.
Experts also stress that assessment reform should be an open and inclusive process that involves a broad range of citizens and educators.
In 1998-99, for the first time, every public school in Vermont must participate in state testing in English/language arts, math, and science.
This past fall, an insert appeared in newspapers around the state that explained the testing program, talked about the results, and provided sample questions and answers.
The state also has trained individual schools in how to use the test results to devise improvement plans. And districts regularly hold "school report nights" to talk about assessment results and display students' work to the community at large.
The Kentucky legislature last year created three new entities to help oversee the development and use of state assessments: a council appointed by the governor comprising educators, business leaders, parents, and other citizens; a national panel of technical experts; and a new legislative-oversight committee.
States also need to ensure that tests are used appropriately, experts say.
The National Research Council suggests the creation of independent oversight bodies, deliberative forums, truth-in-labeling laws, and increased federal regulation as four options for encouraging appropriate use of assessments. Among the questions such safeguards would address: Has the test been validated for the purposes for which it's being used? What is an acceptable margin of error in reporting test results? Are important decisions about students, such as whether they will be promoted to the next grade, made on the basis of more than a single test score?
Finally, experts say, state tests should be subject to evaluations that look for evidence of fairness, validity, and reliability, as well as positive and negative effects on classroom practice.
Vol. 18, Issue 17, Page 17. Published in Print: January 11, 1999, as "In Search of Better Assessments"