South Dakota officials announced this past winter that they were making the state’s much-touted online testing program voluntary for districts, and instead requiring new paper-and-pencil tests to meet the requirements of the “No Child Left Behind” Act of 2001.
Meanwhile, officials in Idaho are forging ahead with a statewide online testing system, which they have modified to meet the requirements of the federal law. The contradictory moves illustrate the push and pull that the law, a wide-ranging revision of the Elementary and Secondary Education Act, could have on the growth of technology-based assessments in K-12 schools. While many experts predict the law could lead to a moderate slowing-down of computer-based testing at the state level, they anticipate a boom in low-stakes, technology-based assessments at the district level to help students prepare for the state exams.
“I think No Child Left Behind poses an interesting situation, because it could be both an impediment and an impetus to the use of technology-based tests,” says Randy Bennett, an expert on technology and assessment at the Educational Testing Service in Princeton, N.J.
Technology can affect virtually every aspect of assessment, from test design and administration to the scoring and reporting of results. Advocates argue that technology could make it easier for states and districts to meet some of the federal law’s requirements, by providing for cheaper and more efficient test delivery, a quicker turnaround of scores, and the ability to analyze and report test data in new ways.
Technological innovations also could make state tests more accessible to special populations of students, such as those with disabilities or limited fluency in English, who must be included in state testing systems under the federal legislation.
Eventually, experts predict, technology could change the face of testing itself, enabling states to mesh the use of tests for instructional and accountability purposes.
“You’ve got the potential that technology could be a solution,” says Wesley D. Bruce, the director of school assessment for the Indiana Department of Education, “but there is, right now, just a huge set of issues.”
An Immediate Concern
One of the most immediate issues for states that have not yet ventured into computer-based testing is cost, particularly given ballooning state deficits.
“Many states don’t believe that they’re going to be able to meet the federal government mandate for assessments under No Child Left Behind with the funds that the act is providing,” Bennett of the ETS says. “So, given that, to then go out and try to build an assessment system that’s going to require even more upfront investment is just not going to be very attractive to anyone.”
In addition, Bennett notes, the federal law’s aggressive timelines—states must administer reading and math tests in grades 3-8 and at least once in high school by 2005-06—work against the “multiyear ramp-up” required to do high-stakes testing on a computer.
“For that reason, I think it’s going to slow things down from the point of technology-based testing,” he says, “at least on the accountability end.”
Not everyone shares that view.
Scott Elliot, the chief operating officer for Vantage Learning—a for-profit company based in Yardley, Pa., that provides online testing and scoring services for both precollegiate and higher education—says his company expects to give some 17 million tests online this year.
Although it’s not atypical for a state to spend $8 to $10 per student to administer a paper-and-pencil test, he estimates the same test could be given on a computer for $5 to $6 a student. And while the cost of human scorers for essay and open-ended questions keeps rising, the cost of automated scoring for such questions will probably stay the same or decrease.
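To put Elliot's per-student estimates in perspective, the arithmetic can be sketched as follows. This is a back-of-the-envelope illustration only: the enrollment of 100,000 students is a hypothetical figure, not one from any state, and the calculation ignores the upfront hardware and development costs that other experts in this article describe.

```python
def savings_range(students, paper_cost=(8, 10), online_cost=(5, 6)):
    """Return (low, high) total savings in dollars from moving a test online.

    paper_cost and online_cost are (min, max) per-student dollar estimates
    quoted by Elliot; `students` is a hypothetical enrollment figure.
    """
    # Smallest gap: cheapest paper test vs. most expensive online test.
    low = (paper_cost[0] - online_cost[1]) * students
    # Largest gap: most expensive paper test vs. cheapest online test.
    high = (paper_cost[1] - online_cost[0]) * students
    return low, high


low, high = savings_range(100_000)  # hypothetical enrollment
print(f"Estimated savings: ${low:,} to ${high:,}")
```

Under those assumptions, the per-student gap of $2 to $5 works out to roughly $200,000 to $500,000 per test administration.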
While Elliot initially was concerned that the weak economy would deter states from venturing into technology-based assessment, he now argues otherwise. “I just think it’s going to break wide open in the next few months with state interest,” he asserts. Demand, he reasons, will be driven largely by the need to produce test results faster, cheaper, and more efficiently.
“I’m getting more interest than ever before,” Elliot says, “and I think it makes sense.”
Richard Swartz, a senior research director at the ETS, estimates that the actual costs of putting a test online and building a customized scoring model are comparable to those of developing a good paper-and-pencil exam.
But, he adds, “once the tests are implemented, the difference in scoring costs is enormously in favor of the computer,” particularly when it comes to the computerized scoring of essays and open-ended questions—an area in which both the ETS and Vantage have worked extensively.
Last year, Indiana used the ETS’s automated-scoring technologies for an online pilot of an end-of-course English exam for high school students, which included both essay and short-answer questions.
“We compared the cost of administering paper-and-pencil tests and scoring all of the open-ended responses by hand to administering the tests on computer and scoring everything with computer,” says Swartz, “and the computer-administered and computer-scored version cost about a quarter of what the paper-and-pencil version costs.”
What’s more, students received the results in a matter of days, rather than months.
The ETS official says more states and districts are now open to technological solutions to their testing needs, “where we never saw that before.”
“Before, it was strictly, ‘We want paper-and-pencil,’” Swartz says. “Now, they’re saying, ‘If you’ve got a technology solution to propose, feel free.’ ”
Tight budgets definitely have put a crimp in state plans, however.
Indiana had hoped to create an online item bank for teachers to craft classroom assessments linked to state standards. But the plan—which would have cost about $800,000 a year for two years to generate a deep enough pool of test questions—is on hold.
“We certainly haven’t given up on that,” says Bruce, the state testing director, “but it’s going to be slightly delayed.”
Adapting to Roadblocks
In Oregon, work on an online writing assessment has come to a halt because of budget cuts. Schools already can opt to administer state reading and math tests in grades 3, 5, 8, and 10 either electronically or on paper through the Technology Enhanced Student Assessment program, or TESA, and the state will pilot additional online tests in other grades this spring.
“We’re able to implement tests at new grades more quickly through TESA because essentially all the infrastructure is right there,” says Bill Auty, the associate superintendent for the state education department’s office of assessment and evaluation.
But he adds: “We’re fortunate that we started when we did. We could not start this up now. Our state funds for assessment have been cut considerably.”
States that have ventured into online testing have already confronted one roadblock under the federal law: a mandate that states measure student performance against the expectations for a student’s grade level.
In 2002, that requirement forced Idaho officials to modify their plans for an online “adaptive” testing system, devised by the Portland, Ore.-based Northwest Evaluation Association. The system would have permitted students to take tests harder or easier than their actual grade levels, based on their ability to answer an initial set of test items. As a compromise, state officials agreed to give all students a uniform, grade-level examination in addition to a more individualized set of test questions that could provide better diagnostic information.
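The general idea behind an adaptive test, in which item difficulty moves up or down with each answer, can be illustrated with a deliberately simple sketch. It is not the Northwest Evaluation Association's method, which the article does not describe; the one-level-up, one-level-down rule, the 1-12 difficulty scale, and the function names are all assumptions made for illustration.

```python
def next_difficulty(current, answered_correctly, lowest=1, highest=12):
    """Move one level up after a correct answer, one level down after a miss.

    The 1-12 scale loosely stands in for grade levels (an assumption).
    """
    step = 1 if answered_correctly else -1
    return max(lowest, min(highest, current + step))


def run_adaptive_section(grade_level, responses):
    """Return the difficulty level presented for each response, in order.

    `responses` is a list of booleans (True = correct) standing in for a
    student's answers; a real system would draw items from a calibrated bank.
    """
    level = grade_level
    presented = []
    for correct in responses:
        presented.append(level)
        level = next_difficulty(level, correct)
    return presented


# A hypothetical 5th grader who misses the first two items, then recovers.
print(run_adaptive_section(5, [False, False, True, True, True]))
# prints [5, 4, 3, 4, 5]
```

The sketch shows why the federal grade-level mandate was a sticking point: a student can spend much of the session on items below or above the enrolled grade. Under a compromise like Idaho's, an individualized section of this kind would accompany a uniform grade-level exam rather than replace it.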
Then, in January, South Dakota officials announced that they were making their state’s adaptive, online testing program—under contract to the Scantron Corp. of Irvine, Calif.—voluntary for districts and instead requiring new paper-and-pencil tests to meet the federal requirements.
Wade Pogany, the director of education services for the state education department, says the fact that the online tests—the Dakota Assessment of Content Standards, or DACS—were adaptive was not the primary reason for moving to a new testing program.
“I don’t ever want to give the impression that South Dakota does not like computer-adaptive tests,” he says. “We’re a big supporter of that. But there are some issues with the new legislation that caused us to look at a fixed-form test in relation to our standards,” including the requirement that students be tested at grade level.
South Dakota, Pogany adds, is trying to “use the best of both worlds” by continuing to finance the DACS for districts that want to use it, while coming up with a new set of exams that will allow the state to measure student performance against both state standards and national norms.
Those experiences have clearly caused states to think twice about venturing into adaptive, online assessments. Oregon had planned to add an adaptive feature to its Web-based assessment program this school year, but delayed it in part because of budget cuts and development work.
“We will implement that next year,” says Auty, the associate superintendent for assessment and evaluation. “It’s something we’re looking into.”
He adds, though: “We want to be very clear, given the controversy, we’re talking about adaptive testing in the grade level.”
Others argue that adaptive testing could meet the requirements under the federal law, but that it would take more extended discussion and explanation from testing experts.
A ‘Major, Major Issue’
A far bigger impediment for state testing systems is the infrastructure needed to deliver a secure computer-based exam to thousands of students under the same conditions at the same time.
“The major, major issue is crystal clear, and that’s the infrastructure,” argues David J. Harmon, the director of testing for the Georgia Department of Education.
Georgia had been planning to give districts the option of offering the state’s Criterion-Referenced Competency Tests in grades 1-8 in a paper-and-pencil or Web-delivered format this spring. But the state was forced to suspend online testing this year after discovering that some 270 actual test questions were publicly available on an Internet site for students, parents, and teachers. The state had developed the Web-based test-item bank to help schools prepare for the exams.
Kathy Cox, the state superintendent of education, says “a minuscule amount” of school districts even wanted that online option, about half a dozen out of 181 districts.
But in Oregon, where tests are available online, Auty estimates about one-third of schools use the Technology Enhanced Student Assessment program electronically.
In Virginia, where districts can choose to offer most high school end-of-course tests online or on paper, Mark J. Schaefermeyer, the associate director of Web-based assessment in the state department of education, estimates that up to 85 of 135 districts will choose the Web-based version this spring.
Here’s the challenge: High-stakes tests, such as those in Virginia, which are used to make key decisions about individual students or schools, must be given under secure conditions so that there are few opportunities for cheating. They’re typically administered during a limited time window. And they are supposed to be given under the same conditions to every student.
“So how many computers does it take?” asks Harmon. “I think that’s the biggest problem.”
In addition to the upfront and ongoing costs of buying and maintaining hardware and software systems, computers and Internet connections don’t always function dependably. And testing sessions may be interrupted or proceed so slowly that the conditions interfere with student performance.
“Capacity is the number-one issue; compatibility is probably number two; and keeping your lines open is number three,” says Bruce of Indiana.
This year, Indiana plans to pilot both end-of-course English and algebra exams exclusively online for high school students. “One of the things we ran into in one of the largest high schools in the state where they wanted to do this,” Bruce says, “is that in order to give the algebra test, they had to close down four computer labs for two weeks. It took that capacity.”
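Harmon's question of how many computers it takes can be roughed out with simple arithmetic. The sketch below assumes every student needs exactly one uninterrupted session and that each machine can host a fixed number of sessions a day; the enrollment, session count, and window length are hypothetical figures, not numbers from Indiana or any other state.

```python
import math


def computers_needed(students, sessions_per_day, testing_days):
    """Minimum number of seats so every student gets one session in the window."""
    sessions_per_seat = sessions_per_day * testing_days
    return math.ceil(students / sessions_per_seat)


# Hypothetical school: 2,000 students, 3 sessions a day, a 10-day window.
print(computers_needed(2000, 3, 10))  # prints 67
```

Shortening the window or allowing for machines that fail mid-session pushes the number up quickly, which is one reason a single exam can tie up several computer labs for weeks.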
Eventually, Georgia’s Harmon predicts, the solution may lie with small, handheld, wireless devices that look like GameBoys, but with larger screens. “Given something like that,” he says, “then there will be very substantial savings” compared with the printing, shipping, and storage costs of a traditional exam.
‘The Wrong Question’
On the measurement side, the biggest challenge for state testing programs is the “comparability” of test results. To compare performance fairly, each student should be tested under the same conditions. If there are any variations, they shouldn’t affect performance.
But Bennett of the ETS points out that, in reality, the equipment used in statewide computerized-testing programs often varies from one school to the next, and sometimes even from one machine to the next within the same building. Researchers don’t yet know how to adjust for such variations in scoring results.
In states such as Virginia, where some students will take a test online while others take a paper-and-pencil version, policymakers need assurances that neither group is particularly advantaged or disadvantaged by the mode of delivery.
A study by researchers at Boston College, for instance, found that students accustomed to writing on computers scored higher on test questions answered electronically than on those answered by hand, while those who were used to writing in longhand scored worse than similar students who took the exam on paper.
Although several testing companies report they have conducted comparability studies, few have been published. Moreover, observes Wayne Ostler, the director of eMeasurement Services for Pearson Educational Measurement, based in Iowa City, Iowa, “you really have to perform comparability studies on every program. You can’t just say, ‘Oh, I tested math in Virginia, and it turned out to be comparable, therefore everything is comparable.’”
But Elliot of Vantage Learning suggests the concerns are off target. “When the car was introduced, can you imagine them conducting a car-horse comparability study?” he says. “It’s the wrong question. The car is coming. I’ve got news for you. If they’re not comparable, so be it.”
Not surprisingly, though, such concerns are leading some states to focus their initial online efforts on low-stakes exams—ones that aren’t used to rate schools or decide students’ futures.
Low-Stakes Testing Boom
In contrast to some of the reluctance about giving high-stakes tests electronically, experts predict a boom in the use of technology for instructional and diagnostic purposes spurred by the new federal legislation.
“What I think No Child Left Behind may speed up is the development of low-stakes, computer-based tests that are designed to help kids do better on the high-stakes ones,” Bennett of the Educational Testing Service says. “I think that’s where we’re really going to see a lot of movement.”
In December 2002, Texas unveiled the first phase of the Texas Math Diagnostic System, a roughly $2 million, online practice assessment program for grades 4-8. This spring, the state is adding a 3,400-question item bank that teachers can use to come up with their own tests, quizzes, and homework assignments. And it is translating the test items in grades 4-6 into Spanish.
Texas also is working to add open-ended questions to the system, as well as an adaptive-testing component.
And in Florida, teachers can use the FCAT Explorer, a free, Web-based program that provides a series of test prompts and skills packages designed around the state’s academic-content standards, which guide the Florida Comprehensive Assessment Test.
“We are currently getting about 11.5 million hits a day,” says policy consultant Don Griesheimer, “and I think the company that’s hosting this has said we have about 4,000 simultaneous users a second. So it’s taken off big guns here in the state of Florida.”
Other states have offered schools access to commercially developed assessment-and-diagnostic packages through state Web portals. Indeed, the greatest activity is taking place at the district level, where companies are offering schools a range of products to help monitor student progress and provide teachers with instant feedback on what to do next.
This indirect market, generated by the No Child Left Behind Act, “I see as the real impetus for how technology will be used in schools,” says Michael T. Nesterak, the director of product management for CTB/McGraw-Hill, one of the country’s biggest commercial test-makers.
Despite such inroads, most observers agree that the potential power of technology to redesign assessment remains largely untapped.
“The powers of the computer have not been fully used yet in any of the state testing programs,” says Neil Kingston of Measured Progress, a Dover, N.H.-based company that has worked on online testing programs in Georgia and Utah.
Most computer-based programs at the state level remain exclusively multiple-choice. That’s not surprising, observers say. “The reality is we have to start where schools and districts are,” says John E. Laramy, the president of Riverside Publishing, one of the nation’s largest test companies. He believes that in time, “you will see companies introducing more innovative, interactive assessments.”
But with the federal law’s emphasis on providing more tests, at more grade levels, faster and more efficiently, no one expects that breakthrough anytime soon.
“No Child Left Behind might be pushing states,” says Doris Redfield, a technology expert at the Appalachian Education Laboratory in Charleston, W.Va., “but it’s pushing them in the direction that they’re already going of getting quicker results and ensuring that their assessments are aligned with their standards.
“Is it really pushing them to create assessments that are more true-to-life or out of the ordinary?” Redfield wonders. “I don’t know that it is.”
At least one observer, Ed Roeber, the vice president of external relations for Measured Progress, worries that the current emphasis on “cheaper, faster, better” is leading states to eliminate or reduce the number of open-ended responses in their testing programs.
After an initial burst of steam, experts admit, the expansion of computer-based testing at the state level is happening more slowly than they had expected.
“It’s kind of frustrating,” says Greg Nadeau, the director of the U.S. Open e-Learning Consortium, a collaboration among 14 states for sharing online educational tools. “The economic contraction is taking some of the wind out of the sails of the [information technology] industry and of state and local and federal projects.”
“It takes years to develop online testing,” Pogany of South Dakota says. “So you can’t come in and say: ‘OK. We have a new test. Put it online. Do it.’”
“Until I know that we have a solid and reliable test,” he adds of South Dakota’s new assessment system, “we’re not going to put kids at a computer. Now, will that happen next year? I don’t know. It will happen when we’re ready, and we’re not going to do it before.”
A version of this article appeared in the May 08, 2003 edition of Education Week