A new study out of the University of Akron has found that automated grading programs can assess student essays as effectively as human readers. The study compared computer-generated ratings to those of human scorers on thousands of essays written by high school juniors and sophomores. The differences, the researchers concluded, were not significant.
“In terms of being able to replicate the mean [ratings] and standard deviation of human readers, the automated scoring engines did remarkably well,” the study’s lead author, Mark D. Shermis, told Inside Higher Ed.
While automated readers may improve efficiency in grading papers and tests, Shermis warned against jumping to the conclusion that the technology could viably replace flesh-and-blood writing teachers. Essay-grading software should be used only “as a supplement to overworked [instructors of] entry-level writing courses, where students are really learning fundamental writing skills and can use all the feedback they can get,” he said.
Shermis also noted that automated graders cannot match humans’ capacity for gauging creativity.
A New York Times article, meanwhile, highlights further cautionary points raised by Les Perelman, a director of writing at the Massachusetts Institute of Technology. Perelman, who has extensively tested E.T.S.’s automated essay grader, e-Rater, says the software can be easily fooled and invites teaching to the test.
The technology, Perelman says, can’t judge the accuracy of facts and—in defiance of George Orwell, Jacques Barzun, and Strunk and White—tends to prefer writing that is long-winded and grandiose. “Whenever possible, use a big word,” Perelman instructs mockingly. “‘Egregious’ is better than ‘bad.’”
The Times article notes that other makers of automated essay-grading programs turned down requests to let Perelman test their products. Which you might say is egregious, or eminently perspicacious, depending on your point of view.