
Grading Automated Essay Scoring Programs, Part II: Policy

By Justin Reich — April 17, 2012 6 min read

How could machines that automatically grade essays lead to Deeper Learning? On the face of it, the premise sounds preposterous. But I’m increasingly convinced that there is a potentially valuable policy strategy here, and this post provides an overview. First, though, a review of what Automated Essay Scoring programs are.

Review of Part I: Automated Essay Score Predictors

My interest in exploring this subject came from a recent study showing that Automated Essay Score Programs had achieved the same level of reliability as human raters, and a subsequent conversation with Will Richardson and Vantage Learning.

Part I in this series examined the question: How do Automated Essay Scoring Programs work? One way to frame the answer to that question is this: “Automated Essay Scoring Programs” is a misleading name for what these tools do. It would be much better to call them “Automated Essay Score Predictors.” AES programs are not very sophisticated at understanding the semantic (meaning-oriented) and syntactic (organization-oriented) elements of human writing. They are not capable of taking a piece of text and examining how that text fulfills the categories defined in a rubric. But if you give those machines access to examples of student writing that humans have already graded, they are incredibly good at predicting how humans would grade other essays. Basically, with some training, they are capable of predicting how humans would score an essay with a level of reliability that rivals the reliability between two humans scoring the same essay.
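
To make the idea of "score prediction" concrete, here is a minimal Python sketch. It is not how any vendor's product works: the word-count features, the nearest-neighbor averaging, the function names, and the four graded essays are all assumptions made up for illustration. The point is only that the program never consults a rubric. It looks at essays humans have already scored and guesses what score a human would give a new one.

```python
import math
import re
from collections import Counter

# Hypothetical toy data: (essay text, score a human rater assigned).
graded = [
    ("The war of 1898 made the United States a world power "
     "because it gained colonies and a navy", 4),
    ("The Spanish-American war in 1898 gave the United States "
     "overseas territory and naval power", 4),
    ("I like history it is fun", 1),
    ("History is fun and I like it a lot", 1),
]


def features(text):
    """Turn an essay into a bag of lowercase word counts (digits are ignored)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_score(essay, graded_essays, k=2):
    """Predict a score as the rounded mean human score of the k most similar graded essays."""
    target = features(essay)
    ranked = sorted(
        graded_essays,
        key=lambda pair: cosine(target, features(pair[0])),
        reverse=True,
    )
    nearest = ranked[:k]
    return round(sum(score for _, score in nearest) / len(nearest))


def exact_agreement(scores_a, scores_b):
    """Fraction of essays on which two raters (human or machine) gave the same score."""
    matches = sum(1 for a, b in zip(scores_a, scores_b) if a == b)
    return matches / len(scores_a)
```

Calling predict_score on a new essay returns the kind of score its nearest graded neighbors received, and exact_agreement can compare machine predictions with a human's scores in the same way it would compare two humans. Real systems use far richer features and models, and studies of reliability typically use more careful agreement statistics than this simple match rate.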

I’m not sure how the AES vendors would respond to making this distinction between “scoring” and “score prediction” (if you are reading this, let me know in the comments or email me!). But I think this distinction is quite helpful in understanding what these machines can and cannot do.

How Automated Essay Score Predictors could Incentivize Deeper Learning

Last week, Barbara Chow, the director of the education program at the Hewlett Foundation, explained to a meeting of grantees why the foundation was investing in research concerning Automated Essay Score Predictors as part of its strategy of expanding opportunities for Deeper Learning in schools. (Disclosure: I run a Hewlett-funded research project, and Hewlett has indirectly paid me a salary for four years, though Harvard is my direct employer. That said, when I had a chance to speak for 15 minutes at the grantee meeting, I devoted the entire time to explaining how their Open Educational Resources grantmaking program could potentially be expanding educational inequalities. So there is some evidence that I try to call it as I see it.) Again, it’s the kind of argument that raises eyebrows. “If we replace human essay raters with machines, students will have a richer learning experience.” Oh, really?

First point: there are two consortia (PARCC and SBAC) developing new tests for the Common Core Standards. In 2014 or 2015, we’re going to have some brand new tests in states all across the country. We have an opportunity to make them better. Here’s how Barbara makes the case that Automated Essay Score Predictors can do that.

Here is an example of a test question from the AP US History test (2006 Released Exam):

Which of the following colonies required each community of 50 or more families to provide a teacher of reading and writing?

A. Pennsylvania
B. Massachusetts
C. Virginia
D. Maryland
E. Rhode Island

Now, this is the kind of question that makes most educators go berserk. A student can have a deep, rich understanding of early American history and not know that factoid. So what if we could replace questions like that with questions like this (thanks to College Board for sharing):

By the early twentieth century, the United States had emerged as a world power. Historians have proposed various dates for the beginning of this process, including the three listed below. Choose one of the three dates below or choose one of your own, and write a paragraph explaining why this date best marks the beginning of the United States’ emergence as a world power. Write a second paragraph explaining why you did not choose the other dates. Support your argument with appropriate evidence.


  • 1898 (Spanish-American War)
  • 1917 (Entry into the First World War)
  • 1941 (Entry into the Second World War)

I have some quibbles, but this is a much, much better question. The question calls upon several skills broadly identified with deeper learning: solving an ill-structured problem (one without a correct answer and requiring tacit knowledge) and communicating that answer in a persuasive, evidence-based argument.

Nearly everyone would agree that question 2 is better than question 1. Cost is the main reason we ask multiple-choice questions rather than having students write open responses. It’s expensive to train, hire, and evaluate armies of raters to read student work. In theory, AES programs change the policy dynamic in three ways. First, it becomes much cheaper to score essays. Second, since you need fewer humans to score essays, you can train those humans better (this addresses the comments raised by @ceolaf in Part I). You can ask more complex questions because you can have better-trained people doing more sophisticated rating. Third, since essays are cheaper to score, test designers can include more (and more sophisticated) essay questions and fewer multiple-choice questions.

So the Hewlett bet is that if you can use technology to allow the tests to ask more complex questions, we’ll get more writing in schools, teachers will be forced to create richer learning environments to prepare students for more complex questions, and students will have more opportunities in schools for deeper learning.

Lots could go wrong here. Systems could keep the same tests, use AES programs, and take every penny saved and spend it on the rising cost of healthcare. Test designers could write dumb essay questions or dumb rubrics. (But they are not forced to do so by the technology; AES programs can predict scores on sophisticated questions or source-based questions as well as they can with simpler questions; the limiting reagent is the capacity of humans to agree on scores with a nuanced rubric, not the limit of the technology. Similarly, students can game the rubric, but they can’t game the AES programs. Students might be able to game a human scoring a rubric, but students can’t game a program evaluating the frequency of co-located stemmed words. The key limitation in the system is human training (which is @ceolaf’s point).)

But for all that, I think it’s entirely plausible that by leveraging the power of computer programs to instantly and inexpensively predict essay scores, we can create more sophisticated assessments and have those better assessments drive better classroom practices. The worst case scenario that I envision is that even though we make kids do more writing, it’s still stupid writing. In our current policy context, having kids write more is a downside I’m willing to risk.

This argument is not in conflict with the argument that we have too much testing, which I also believe. We should have less testing, and the testing we have should be better. It is worth experimenting with whether AES predictors can help with the latter issue.

One last point: I’m thrilled that CMU submitted an open source score predictor in the contest, and doubly thrilled that it does so well. I really hope that the consortia look very seriously at the tremendous advantages of using a scoring mechanism that scholars and data analysts around the world can collaborate on and improve collectively.

There is lots more to talk about here, and plenty to debate and discuss. I hope if you have questions or are enraged by something, you’ll leave a note in the comments. In my next and last post in the series, I’ll talk about how Automated Essay Score predictors could be used by teachers and students in classroom settings for teaching and learning about writing.

For regular updates, follow me on Twitter at @bjfr and for my papers, presentations and so forth, visit EdTechResearcher.


The opinions expressed in EdTech Researcher are strictly those of the author(s) and do not reflect the opinions or endorsement of Editorial Projects in Education, or any of its publications.
