Impact of Paper-and-Pencil, Online Testing Is Compared
How students perform on computer-delivered tests depends, in part, on how familiar they are with the technology, concludes a set of studies conducted by the Princeton, N.J.-based Educational Testing Service.
The studies compared how students performed on mathematics and writing items from the National Assessment of Educational Progress when the items were given on paper versus by computer. The results were released this month by the National Center for Education Statistics, which oversees the federal testing program.
In the math study, nationally representative samples of 4th and 8th graders in 2001 took a computer-based math test and a test of computer facility, among other measures. In addition, at the 8th grade level, a randomly selected control group of students took a paper-based exam containing the same math items as the computer-based test.
Average scores for 8th graders taking the computerized test were about 4 points lower than for those taking the paper version, a statistically significant difference. On average, 5 percent more students responded to individual items correctly on paper than on a computer.
At both grade levels, students’ facility with the computer—based on hands-on measures of input speed and accuracy—predicted their performance on the online exam.
The writing study compared the performance of a nationally representative sample of 8th graders who took a computer-based writing test in 2002 with that of a second, nationally representative sample of 8th graders taking the same test on paper as part of the regular NAEP administration that year.
Results showed that average scores on the computer-based writing test generally were not significantly different from average scores on the paper-based exam. But, as with the math test, individual students with better hands-on computer skills tended to achieve higher online scores, after controlling for their writing skills on paper.
Arnold A. Goldstein, the director of reporting and dissemination for the assessment division of the NCES, said that the findings suggest a possible problem in administering the national assessment online, but that further research is needed. “I think we would need to have a larger field test in a more traditional NAEP testing setting in order to determine that,” he said.
Mr. Goldstein added that, while this was a one-time study, the NCES—an arm of the U.S. Department of Education—may do further work in the future to explore the administration of the assessment online.
Scoring by Computer
The studies also examined the feasibility and costs of generating and scoring NAEP math and writing items by computer.
While the machine's grades on simple math items were generally interchangeable with those of human scorers, that was less true for items requiring extended text responses. On those items, the computer tended to treat correct responses that were misspelled as incorrect, a technical shortcoming that could be addressed by including common misspellings in the automated scoring key or including a spell-check before an answer is submitted, according to the study's authors.
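As a rough illustration of the first fix the authors describe, a scoring key can list common misspellings alongside each correct answer. The sketch below is not the method used in the studies; the item names, the answers, the misspellings, and the functions `normalize` and `score` are all hypothetical.

```python
# Hypothetical scoring key: each item maps to its correct answer plus
# common misspellings that should still earn credit (illustrative data only,
# not taken from the NAEP studies).
SCORING_KEY = {
    "item_1": {"parallelogram", "paralellogram", "parallelagram"},
    "item_2": {"isosceles", "isoceles", "isosoles"},
}


def normalize(response: str) -> str:
    """Lowercase and collapse whitespace so formatting does not affect the score."""
    return " ".join(response.lower().split())


def score(item_id: str, response: str) -> int:
    """Return 1 if the normalized response is in the item's accepted set, else 0."""
    return int(normalize(response) in SCORING_KEY.get(item_id, set()))
```

Under these assumptions, `score("item_1", "  Paralellogram ")` earns credit despite the misspelling, while `score("item_1", "rectangle")` does not. A key of this kind accepts only the misspellings someone thought to list, which is one reason the authors also suggest a spell-check before an answer is submitted.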
On the writing test, however, automated scores did not agree closely enough with the scores awarded by human readers to consider the two types of scores interchangeable.
"That is something that needs to be considered in any further development work," said Mr. Goldstein. "For example, whether a trend can be extended from a paper-and-pencil to an online administration of the assessment."
Vol. 25, Issue 01, Page 14