Massachusetts students who took state exams online in 2015 scored significantly worse than their peers who took the same exams on paper, according to a new study by the American Institutes for Research.
The so-called “mode effect” was particularly pronounced in English/language arts, where the discrepancy amounted to nearly a full year of learning, AIR found. Lower-performing students, special education students, and English language learners suffered particularly sharp penalties when they took the ELA exams online.
In both ELA and math, however, the negative effects of taking the exams online diminished considerably during the second year of testing.
“To our knowledge, this is the first large-scale study of PARCC test-mode effects conducted with multiple years of data across an entire state,” according to the paper, titled “Is the Pen Mightier Than the Keyboard? The Effect of Online Testing on Measured Student Achievement,” and published in the February edition of the academic journal Economics of Education Review.
“We find that students administered an online exam score systematically lower than if they had taken the test on paper,” the researchers wrote.
That conclusion is generally consistent with previous studies and reporting, including a 2016 Education Week investigation in which PARCC officials acknowledged a general pattern of online test-takers scoring lower than paper-and-pencil test-takers during the 2014-15 administration of the exams.
In the years since, PARCC (short for Partnership for Assessment of Readiness for College and Careers) has changed its organizational structure and business model. Only a handful of states currently use the full PARCC exam. Instead, a number of states now mix test items developed internally with items leased from New Meridian, the nonprofit group that assumed control of the now-defunct multi-state PARCC consortium’s “item bank.”
Massachusetts used a small handful of the New Meridian test items in 2018. But as of 2019, that is no longer the case, according to state education officials.
A spokesman for New Meridian acknowledged the pronounced mode effects during the first two years of PARCC testing in Massachusetts, saying that state’s rollout of the new exams was particularly cumbersome. But more recent internal analyses of PARCC administration in other states have found that the mode effect has evaporated over time, he said.
“To the extent that it exists, it appears in the first year or two of a transition from paper to computer-based testing,” the New Meridian spokesman said in a statement.
“By 2018, there was no significant effect.”
Growing Familiarity With Computer-Based Exams
During the 2014-15 school year, roughly 5 million students across 10 states and the District of Columbia sat for the first official administration of the PARCC exams.
In 2015 and 2016, Massachusetts allowed schools to administer either PARCC or the state exam that had previously been in use, the MCAS, short for Massachusetts Comprehensive Assessment System. Districts that selected PARCC could also choose whether to administer that exam online or on paper.
That flexibility led to at least seven different permutations of district testing plans over the two-year period.
In a statement, Bob Lee, the chief MCAS analyst for the Massachusetts Department of Elementary and Secondary Education, praised the AIR study as “excellent” and said state officials proactively took steps to make sure no one was punished for taking state exams via computer.
“Our assessment and accountability offices were aware that there were large differences, on average, between students taking online and paper versions of the test,” Lee wrote. “In anticipation of the difficult transition to online testing, our board had decided to not use PARCC results for school accountability, and our board had also decided to hold schools harmless for their participation if they took the PARCC test.”
Currently, Massachusetts uses a “next-generation” version of MCAS, designed to be administered via computer. This year, 98 percent of students in the state are expected to take the exam online, Lee said. The other 2 percent of students will use paper exams as an accommodation for a special need.
Nationwide, the transition to online testing has been driven by a number of perceived advantages, including more flexibility in designing test questions and performance tasks, faster scoring, reduced opportunities for cheating, and the chance to expose students to greater technology use.
It’s the last issue, however, that appears tied to the mode effect that often emerges when the shift to computer-based testing is first made.
During the first year of online testing, experts say, students’ unfamiliarity with the devices and digital interfaces they’re using to take the exams appears to drive a large portion of the discrepancy in scores between online and offline test-takers.
In Massachusetts, for example, the AIR researchers found the penalty for online test-takers in the first year of administering the PARCC math exams was about .10 standard deviations, or the equivalent of a little more than five months of learning. The penalty for online test-takers in ELA was much higher, at .25 standard deviations, or 11 months of learning.
But in the second year of PARCC testing, those mode effects were much smaller: The penalty for online test-takers in math was about one-third the size of that found in 2015, while the penalty for online test-takers in ELA was about one-half the size of that found in 2015.
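For readers who want the second-year figures in the same units as the first-year ones, the arithmetic can be sketched roughly as follows. This is a back-of-the-envelope illustration, not the researchers' method: it takes the rounded first-year penalties and the "about one-third" and "about one-half" fractions reported above at face value, and the variable and function names are made up for the example. It does not convert the results to months of learning, because no conversion rate is given here.

```python
# First-year (2015) penalties for online test-takers, in standard deviations,
# as reported in the article (rounded figures).
first_year_sd = {"math": 0.10, "ela": 0.25}

# Approximate share of the 2015 penalty that remained in the second year,
# per the article ("about one-third" for math, "about one-half" for ELA).
remaining_share = {"math": 1 / 3, "ela": 1 / 2}


def second_year_penalty(subject):
    """Rough second-year penalty in standard deviations (illustrative only)."""
    return first_year_sd[subject] * remaining_share[subject]


for subject in ("math", "ela"):
    print(f"{subject}: about {second_year_penalty(subject):.3f} SD")
```

On these rounded inputs, the second-year penalty works out to roughly 0.03 standard deviations in math and roughly 0.125 in ELA. Those are approximations derived from the figures above, not numbers taken from the study itself.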
Implications for Testing Policy
Still, the finding of significant mode effects in the early years of online testing has serious policy implications for states, districts, and schools, the AIR researchers argued.
Other states moving to the computer-based tests may want to consider Massachusetts’ “hold harmless” policy, they suggest. The idea is to ensure that lower scores that result from taking the tests online, rather than from any lack of knowledge or skill, don’t affect things like holding students back a grade, placing students into special education, teacher evaluations, or school accountability decisions.
“States or districts that administer PARCC online to some students and on paper to other students should be aware that the paper students will likely score systematically higher, even in the second year,” the study concludes.
“Policies that reward or sanction students, teachers, or schools based on student test scores should take test mode effects into account.”
Photo: A student at Marshall Simonds Middle School in Burlington, Mass., reviews a question on a PARCC practice test before 2014 field-testing of the computer-based assessments. (Gretchen Ertl for Education Week/File)
A version of this news article first appeared in the Digital Education blog.