At first blush, the advice from the measurement experts couldn’t be clearer: Decisions that have a major impact on a student, such as promotion to the next grade or high school graduation, “should not be made on the basis of a single test score. Other relevant information should be taken into account if it will enhance the overall validity of the decision.”
That guidance comes from Standards for Educational and Psychological Testing, the bible of test development and usage. And it’s been reiterated, in slightly different forms, by such respected groups as the National Research Council and the American Educational Research Association.
But on closer inspection, what is meant by a “single test score” and by “multiple measures” of performance is far from clear.
Yet how states make such distinctions can determine how they craft their accountability policies, and can make a crucial difference in the futures of hundreds of thousands of students.
Today, 18 states require students to pass graduation exams to earn a diploma, and six more plan to do so. Three require students to pass state tests to be promoted in certain grades, a number that is expected to rise to seven in the next few years.
As it turns out, states—and experts—disagree on the answers to such questions as:
- If a high school exit exam has separate sections in English and mathematics, and students must achieve a minimum score on each to receive a diploma, do the two pieces still represent a “single test score”?
- What if students have multiple opportunities to retake an exam?
- And if, in addition to passing a test, students must earn a specific number of course credits and attend school regularly to graduate, do those count as “multiple measures”?
Many states, for example, insist they are not relying on a single test score to bestow or withhold a diploma because students have multiple opportunities to retake mandatory exams.
In addition, states often demand that students attend school regularly and complete a minimum number of course credits to graduate, along with passing a test.
But some critics argue that, under such a scenario, each requirement essentially acts as its own single measure, because failure to achieve any one of them bars the path to a diploma.
In Massachusetts, where students must pass the Massachusetts Comprehensive Assessment System in English and math to earn a diploma starting in 2003, the Massachusetts Teachers Association has launched a $600,000 advertising campaign against what it calls the “one-size-fits-all, high-stakes, do-or-die MCAS test.”
But Jeffrey M. Nellhaus, the state’s associate education commissioner for student assessment, maintains that an “important distinction” exists between basing a diploma on a single score from a single administration of a test and allowing students to retake the exam multiple times, as is the case in Massachusetts.
Texas high school students must pass the English, math, and writing portions of the Texas Assessment of Academic Skills, or TAAS, to earn a diploma but have many chances to retake the exams.
“That is our definition of multiple measures,” said Ann Smisko, the associate commissioner for curriculum, assessment, and technology in the state education agency. “We give kids at least eight tries on this test before scheduled graduation and, in fact, kids and even young adults can continue to have opportunities to take and pass the test some other time.”
But multiple chances to retake an exam and multiple measures of performance are not the same, counters Scott R. Palmer, the deputy assistant secretary in the U.S. Department of Education’s office for civil rights.
“I think it’s pretty clear that these are different things,” he said. The OCR last month released a resource guide on the use of tests in making high-stakes decisions about students.
“The question is, if students failed the test twice, would there be some other way that they could prove that they had the competencies?” said Lorrie A. Shepard, a professor of education at the University of Colorado at Boulder. “And, if not, states really are not using multiple measures.”
In its 1999 report on high-stakes testing, the National Research Council recommended that such decisions as promotion and graduation “should not automatically be made on the basis of a single test score, but should be buttressed by other relevant information about the student’s knowledge and skills, such as grades, teacher recommendations, and extenuating circumstances.”
The report was particularly outspoken against using test scores alone to decide whether to promote students to the next grade or hold them back. But the report was less forceful in discouraging the use of exit tests for high school diplomas.
“If failing to achieve a certain score on a standardized test automatically leads to withholding a diploma,” the NRC committee wrote, “this may be inconsistent with current and draft revised psychometric standards.”
Members of the committee acknowledge that their conclusions may be confusing and that they never defined what is meant by “a single test score.”
“The truth is that I’m not sure that we were entirely clear about it,” said Robert M. Hauser, a professor of sociology at the University of Wisconsin-Madison and the chairman of the panel that produced the report. “I’ve had trouble sorting it out.”
Jay P. Heubert, the study director for the report and an associate professor of education at Teachers College, Columbia University, agrees. “I don’t think it is totally clear,” he said.
His own view, Mr. Heubert added, is that states should be using a combination of test scores and course grades to gauge whether students earn a diploma.
“What does it mean to use them both?” he said. “I think what it should mean is to use them in such a way that a high score on one criterion can outweigh a lower score on another criterion.”
That’s precisely the way that college-admissions decisions are made, Mr. Heubert said. “Someone who has straight A’s might be forgiven a somewhat lower score on the SAT, and vice versa.”
Chester E. Finn Jr., the president of the Washington-based Thomas B. Fordham Foundation and a stalwart supporter of testing, also agrees.
“Just as I don’t believe you should be admitted to college solely on the basis of your SAT score, without considering anything else about you, I don’t believe you should be given or denied a high school diploma or, for that matter, promotion to the 5th grade, solely on the basis of a number that is derived from a test,” Mr. Finn said.
He added: “I do believe, though, unlike many of the people that are saying these things, that the test could legitimately comprise a very sizable part of the decision, not a trivial part of the decision.”
Fundamentally, measurement experts are trying to guard against situations in which, for a variety of reasons, the test is not a good indicator of some individuals’ accomplishments.
Mr. Finn cites his two children as examples. One had high test scores but never applied himself in school, he said; the other worked diligently but had relatively low test scores.
“And I’m sensitive that, if we had gone on test scores alone, my relatively low-scoring kid would not have fared as well as turned out to be the case,” Mr. Finn said.
In fact, studies show that tests are less precise and reliable indicators of performance than people might assume. Students often score higher or lower than their actual achievement levels on any particular administration of an exam.
That’s why the principle of giving students multiple opportunities to retake a test is so critical, experts stress.
“The initial premise is that no single test is perfect,” said Paul R. Sackett, a member of the National Research Council committee and a co-chairman of the committee that wrote the 1999 edition of Standards for Educational and Psychological Testing. “We’ve got measurement error.”
For that reason, he said, the standards recommend that “other relevant information” be taken into account, “if it will enhance the overall validity of the decision.”
That doesn’t mean states should never use a graduation test, said Mr. Sackett, an employment-testing expert.
What it means, in his view, is that if additional information, such as teacher recommendations, is to be considered, clear evidence must exist that its use will result in making better, more valid, decisions about students. Simply arguing, on principle, that such information should be included isn’t enough, Mr. Sackett said.
Many states, in fact, adopted their graduation exams in response to concerns that the existing requirements, based on grades and teacher judgments, were too subjective.
“We’ve had multiple measures at the high school level, and the reason we’re moving away from that is because there has been some loss of confidence in what a diploma means,” Mr. Nellhaus of Massachusetts said.

Both the NRC report and the Standards guide outline some of the actions states should consider if they use graduation exams:
- Students must receive adequate notice of the test and its consequences.
- Students should have an opportunity to learn the knowledge and skills being tested, meaning that the test must be aligned with the curriculum.
- The process for setting passing scores should be documented and evaluated.
- Students should have equal access to any specific preparation for taking the test.
- Students who risk failing the exam should be advised well in advance and provided with appropriate instruction, or remedial help, that would improve their chances of passing.
- Students should have multiple opportunities to retake the tests.
State courts have used similar reasoning to uphold tests linked to diplomas in Indiana and Texas.
At least for now, Ms. Smisko said, Texas is relying primarily on the courts for guidance. The state is devising new high school exit exams, which will replace the 10th grade TAAS starting in the 2002-03 school year.
Others worry that it’s precisely the fear of lawsuits that is discouraging states from designing more creative alternatives.
“I think part of the difficulty is that, if you’re going to deny a high school diploma to anybody on the basis of how well they do on any set of criteria, you need to be ready for court suits,” said Richard J. Murnane, a professor of education at Harvard University. “And I think that has led many states that envisioned a broader range of assessment criteria to back off because they found it awfully expensive and difficult to get reliable grading of, say, portfolio assessments.”
A Variety of Approaches
In the absence of clear guidance, states have taken or are considering a variety of approaches to linking tests with high school diplomas.
In Texas, for example, students who do not pass the TAAS have the option of substituting passing scores on state end-of-course exams given in algebra, biology, 10th grade English, and U.S. history. Students must earn passing grades on three of the four exams, including English and algebra.
About 3,300 students, out of a statewide graduating class of roughly 200,000, choose that route, according to Ms. Smisko.
New Jersey students who don’t do well on the state’s High School Proficiency Tests—which are slowly being phased out in favor of more rigorous exams—can still earn a diploma by completing what is called the Special Review Assessment.

The alternative assessment consists of a series of open-ended tasks, typically 45 minutes in length, that students can complete in their classrooms over the course of a semester. The state also has translated the Special Review Assessment into eight languages for students with limited English skills.
“There ought to be multiple ways for students to demonstrate their knowledge and skills,” said Robert J. Riehs, New Jersey’s acting assessment director.
New York and Virginia also have approved alternative tests that students can substitute for the state exams, such as the College Board’s Advanced Placement tests, but those typically are taken by more advanced students.
Massachusetts is considering creating a “retest” for students who initially fail the MCAS. It would focus more narrowly on the knowledge and skills at the “needs improvement” level now required to earn a diploma.
Because the state publishes most test items after each administration of the exam, said Mr. Nellhaus, the associate education commissioner, “this will give students and parents and teachers a much more precise idea of the skills and knowledge that students need to attain in order to pass.”
Delaware officials plan to use their state’s high school exams to determine the type of diploma a student receives.
“At this point, we’re talking about a distinguished diploma, an academic diploma, and a standard diploma,” said Wendy B. Roberts, the acting director of the assessment and analysis office in the state education department.
State law also requires “alternative indicators” for students who don’t do well on the exams, “but we have not gotten into the details yet of exactly what those alternative indicators will be,” Ms. Roberts added.
Students who fail Indiana’s high school graduation tests can still earn a diploma, either by achieving a minimum grade point average in a set of core academic courses identified by the state or by going through an appeals process.
Providing an appeals process, said Mr. Finn of the Fordham Foundation, is probably more realistic than expecting states to develop complex folders of student work to determine graduation for thousands of students.
California and Maryland are writing end-of-course exams instead of a single high school exit test.
Last month, a commission in Ohio recommended that the state give students at least two ways to demonstrate they have met standards: either a series of end-of-course exams or a cumulative high school achievement test.
Wisconsin, meanwhile, has left it up to districts to establish the criteria for earning a diploma based on multiple measures, including test scores, grades, and teacher discretion.
And Wyoming officials want each school to collect a body of evidence that would prove a student has met the state’s graduation standards.
Rather than withhold diplomas, some states provide students with incentives to do well on state tests.
Six states, for example, offer scholarships to students who receive high scores. Officials in those states believe such policies address the concern that, unless some stakes are attached to performance, teenagers won’t be motivated to do their best and the results won’t reflect what they’ve actually learned.
In the end, the 1999 NRC report suggested, knowledge about what constitutes best practice for high school graduation testing may still be years away.
Research is needed, the panel argued, on the effects of high-stakes graduation tests on teaching, learning, and high school completion; on alternatives to denying students a diploma based on test scores; and on the effects of different kinds of high school credentials on employment and other postschool outcomes.
“We do not know,” the committee wrote, “how best to combine advance notice of high-stakes test requirements, remedial intervention, and opportunity to retake graduation tests.”
A version of this article appeared in the January 10, 2001 edition of Education Week as Test Debate: What Counts as Multiple?