Essay Grading Goes Digital
But critics question the use of software to assess writing
It didn’t take long for Pat Thornton’s 7th graders to figure out how to outsmart the computer. They even coined a term for it: “schmoozing.”
“If you write a schmoozy essay,” one student told Thornton soon after she began using a Web-based essay-grading program, “the computer gives you a good grade.”
The Irvine, Calif., students discovered that if they just included predictable words, phrases, or features in their paper, the computer would view it favorably regardless of the quality of the work.
Just as quickly, though, the young scribes realized that while Thornton relies on artificial intelligence to read and evaluate some classwork, the 32-year veteran English teacher is still watching.
“They can fool the computer, but they can’t fool the teacher as easily,” says Thornton, who teaches at the 620-student Lakeside Middle School.
The Criterion Online Writing Evaluation Program, a product of the Educational Testing Service in Princeton, N.J., helps Thornton supplement her writing instruction and monitor students’ work without adding to her grading burden.
Several software and online programs developed over the past five years promise to make it easier to incorporate more writing exercises for students without piling more work on the teacher.
“This gives students opportunities to practice their writing and get immediate feedback,” Thornton says of the computer program. “It’s not the same type of feedback that they get from a teacher, ... but these are not portfolio pieces, these are practice.”
As the technology has been refined for more practical use in middle and high school classrooms, more states and districts are examining how such programs can bolster writing programs or streamline testing in the subject.
But some experts worry that the products’ capabilities are overstated, and they warn that the potential for misuse is great.
“What worries me is not that these tools are electronic, nor that they provide some assistance with the labor-intensive task of responding to student writing, but that they will be seen by some as making unnecessary the professional knowledge that teachers need to effectively respond to student writing,” says David M. Bloome, the president of the National Council of Teachers of English, based in Urbana, Ill.
“Some will mistakenly believe that the software can do it all.”
As teachers juggle growing demands on the curriculum and the school day, many have had to cut back the time students spend learning and practicing writing skills, scholars say. Few teachers, they add, can devote the hours necessary to read and grade such assignments.
And with the multitude of resources available on the Internet and students’ penchant for copying indiscriminately from them, many teachers now dismiss essays and research papers as little more than an exercise in cheating.
As a result, according to at least one recent study, middle and high school students are encountering fewer and fewer writing requirements in school just as the business world is demanding more such skills from job candidates. Commissioned by The Concord Review, a journal of high school historical essays, “The History Research Paper Study” found that most students never encounter extended writing assignments in school.
Thornton, for one, has not wavered in her focus on writing over the years. California’s Irvine Unified School District, where she teaches, pushes writing across the curriculum for all students, who have come to expect regular essay assignments.
But with more than 180 students in her 7th and 8th grade language arts classes, the teacher can spend upwards of 60 hours grading a single assignment. So when the 23,000-student district signed on to the Criterion system in 2001, it was a welcome relief.
In the past, Thornton asked students to keep the weekly essays they had written in a personal folder, where she could review them at any time. Students selected their favorites at the end of the semester, when the teacher would sit down with her red pen and evaluate them critically.
With Criterion, she says, students respond to a question, or prompt, that requires them to write a descriptive or persuasive essay on a given topic. They are evaluated within seconds on everything from grammar and spelling to style and organization. They can revise their work as many times as they want.
The feedback produced from those practice sessions allows the teacher to view the work of individual students or entire classes in ways that have been impractical in the past. Thornton can, for example, read a submission, scan the computer’s evaluation, then add her own comments into the file.
She can also have the Web-based tool sort students’ work based on the sophistication of the writing.
In doing so recently, Thornton saw that 8th graders in one class were relying far too much on simple sentences, even after her extended lessons on using compound and complex constructions. She reviewed the material with her students again and saw improvement in their subsequent submissions.
In another class, the program prompted a student to dig more deeply into his vocabulary knowledge to revise an essay after the computer analysis showed he used the same words over and over.
Human vs. Computer
Computer-based evaluation of writing assignments is a relatively recent innovation, but one that has evolved rapidly in its practical applications, as well as its status among some teachers and administrators.
The programs have already been pressed into service in higher education. The scorer designed by the ETS, for example, is used to grade the essay portions of the Graduate Management Admission Test. The Vantage Learning scorers, VLP and IntelliMetric, are used to assess the writing skills of entering college students.
But there are no plans to use such software to score the written portions of the SAT college-entrance exam or the written sections on Advanced Placement tests.
When the first researchers to develop computer-assisted essay grading unveiled their software projects in 1998, much of the reaction was skeptical.
Educators said they doubted that technology could be used to evaluate submissions that are inherently unique for each student. Many observers scoffed at the idea that creative concepts expressed in writing could be fairly or effectively evaluated by a machine.
Yet developers of essay-grading programs tout their accuracy and consistency in evaluating papers by students in 4th grade through graduate school.
They say the development process works like this: Once essay questions and a scoring “rubric” are crafted, and hundreds of sample responses are collected, specially trained educators or professional readers evaluate them. The grading data are entered into the computer, and a scoring model is created.
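The process described above, in which human-scored sample essays are turned into a scoring model, can be illustrated with a deliberately simple sketch. This is not how Criterion, E-Rater, or IntelliMetric work; the article does not describe their internals. The feature set, the nearest-neighbor approach, and the names `features`, `build_model`, and `predict` are all assumptions chosen for illustration.

```python
import re
from statistics import mean


def features(essay):
    """Crude surface features (illustrative assumptions, not a real product's):
    word count, mean words per sentence, and vocabulary diversity."""
    words = re.findall(r"[a-z']+", essay.lower())
    sentences = [s for s in re.split(r"[.!?]+", essay) if s.strip()]
    if not words or not sentences:
        return (0.0, 0.0, 0.0)
    return (
        float(len(words)),
        len(words) / len(sentences),
        len(set(words)) / len(words),
    )


def build_model(scored_samples):
    """'Train' by storing each sample's features next to its human score."""
    return [(features(text), score) for text, score in scored_samples]


def predict(model, essay, k=3):
    """Average the human scores of the k samples nearest in feature space.
    Features are left unscaled, so word count dominates the distance."""
    target = features(essay)

    def distance(item):
        vec, _ = item
        return sum((a - b) ** 2 for a, b in zip(vec, target))

    nearest = sorted(model, key=distance)[:k]
    return mean(score for _, score in nearest)


# Hypothetical sample essays with scores assigned by human readers.
samples = [
    ("Dogs are good. I like dogs.", 2),
    ("The museum visit changed how I think about history, because every "
     "artifact told a story. Seeing them made the past feel close.", 5),
    ("School uniforms are bad. They are bad because they are bad.", 1),
]
model = build_model(samples)
print(predict(model, "Cats are nice. I like cats a lot.", k=1))
```

A model like this only measures surface traits of the text, which is one way to see why students could "schmooze" a scorer with predictable features, and why the human scores behind the samples matter so much.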
“These systems are scoring as accurately as [human] readers, at a tiny fraction of the cost,” says Richard Swartz, the executive director of the performance-modeling and scoring division of the Educational Testing Service.
The technology, he contends, is less prone to error or other human factors that can affect grading. The computer never has a bad day, Swartz says.
Officials in several states have taken note and begun to pilot such programs.
Indiana, for example, tested the ETS scoring technology used for the Criterion program last year. The testmaker’s E-Rater and C-Rater products were used to evaluate more than 130,000 essays and short answers submitted by more than 22,000 11th graders on a state assessment.
Criterion Online is being used in more than 300 schools around the country.
Pennsylvania has conducted three pilot programs for evaluating reading and writing assessments of students in grades 6, 9, and 11, using the IntelliMetric essay-scoring system developed by the Yardley, Pa.-based Vantage Learning.
Officials in Massachusetts, Ohio, and Oregon have also begun to explore the potential for using the technology on a large scale.
Can Computers Think?
Not everyone, though, is convinced of its ultimate benefit to instruction. The technology is still a target of criticism among many educators committed to the power of human expression through writing.
Bloome of the National Council of Teachers of English worries that the context and audience for students’ writing will become as artificial as the medium. Students, in effect, will begin composing for the computer and its prescribed criteria for sound writing.
“One of the things that we hope that students will learn as they learn about writing is how to use all of the resources that are involved in our language system to express meanings, to be creative, to express emotion, to be persuasive,” says Bloome, a professor of education at Vanderbilt University, in Nashville, Tenn. “While this may include the so-called foundations, ... we need to be careful that any of those tools we use as teachers does not work against our pedagogical goals.”
The developers of the programs acknowledge the technology has artistic limitations. Criterion Online was intended to provide students with writing practice and feedback that wouldn’t otherwise be practical in most classrooms, says Swartz. It is not intended to take the place of good instruction or exercises that foster creative writing.
“The ideal use of the technology is to let the computer do the dumb stuff, and let the teacher focus on the parts that require the thought and judgment of an experienced teacher,” Swartz says. “This computer can’t read or think, and it can’t evaluate the quality of ideas like a teacher can.”
Vol. 22, Issue 35, Pages 39-40, 42
Published in Print: May 8, 2003, as Essay Grading Goes Digital