It’s very unlikely that the flattening of student test scores in states using the Smarter Balanced assessment in 2016-17 is due to technical problems with the exam, the group’s leadership concluded in a new report released Wednesday.
In February, Curriculum Matters brought you the details of a debate about the meaning of the largely flat scores on the exam, given that year in 13 states.
At the time, some critics questioned whether those results genuinely reflected a stall in student achievement, or whether they pointed to serious issues with the test. Perhaps, they said, Smarter Balanced didn’t include enough test questions to measure the full range of student skills.
Smarter Balanced Assessment Consortium officials promised to examine this question in a series of technical analyses.
Essentially, consortium wonks took samples of results from the states that administered the exam and analyzed them to see whether they lined up with expected scores. They also looked at how the test items “performed,” that is, whether they were as easy or hard as field tests indicated.
They’ve now released the results of their analyses. Warning: the 30-page report is pretty technical. Have a testing expert on speed-dial if you decide to brave it. (SBAC also put out a much shorter summary.)
Here are the most important takeaways:
- Mean test scores were down in most grades for English/language arts, and most steeply in 5th grade English/language arts. In math, they fell for all secondary grades, but rose for the primary grades. Overall, however, these are very small changes, and it’s not clear how meaningful they are in terms of measuring learning.
- SBAC enlarged the pool of computer-scored test questions by half in ELA, and about a third in math, for the 2016-17 administration.
- These new questions were generally easier than the “old” questions, which contradicts the theory that scores were flat because the test had too many difficult questions. Importantly, because the test adapts to students’ achievement levels, having more easy questions in the pool doesn’t necessarily mean that students got an “easier” test overall.
- Students did tend to do slightly less well on these new test questions than on older test items. These differences were small overall, though, and don’t seem to explain the bigger-picture achievement patterns. Grades and subjects with the highest performance differences between old and new test items didn’t neatly line up with the grades and subjects that saw mean score declines.
- Student performance differed by test question type. In math, students tended to do better on computer-scored items than on the “performance tasks” (multi-stage problems worth several points and scored by hand), and in ELA, they did better on performance tasks than on the computer-scored questions.
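To see why a pool with more easy questions doesn’t have to produce an easier test, here is a minimal sketch of adaptive item selection. It is not SBAC’s actual algorithm: the function `administer`, the difficulty values, and the fixed ability estimate are all hypothetical, and a real adaptive test would update its estimate of the student after each response.

```python
from statistics import mean

def administer(pool, ability, n_items):
    """Pick, without replacement, the n_items whose difficulty is closest
    to the student's ability (a simplified, hypothetical selection rule)."""
    remaining = sorted(pool)
    chosen = []
    for _ in range(n_items):
        item = min(remaining, key=lambda d: abs(d - ability))
        remaining.remove(item)
        chosen.append(item)
    return chosen

# Made-up item difficulties on an arbitrary scale (higher = harder).
old_pool = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
# The "new" pool adds only easy items, so its average difficulty drops.
new_pool = old_pool + [-2.5, -2.25, -1.75, -1.25]

student_ability = 1.0  # hypothetical above-average student
old_test = administer(old_pool, student_ability, 3)
new_test = administer(new_pool, student_ability, 3)

print("pool mean difficulty:", mean(old_pool), "->", round(mean(new_pool), 2))
print("test mean difficulty:", mean(old_test), "->", mean(new_test))
```

In this toy setup the pool gets easier on average, but the above-average student is still served the same questions near his or her own level.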
Looking across this data, SBAC officials conclude that the test was functioning as it was designed to do, and that the 2016-17 results are an accurate description of what students in those states knew and could do when they took the test.
In addition, an outside reviewer from the Center for Assessment, a New Hampshire-based test-consulting group, largely agreed with the SBAC analysis.
So just how are we supposed to interpret the flat scores for 2016-17? That’s trickier.
The data suggests that some of the early gains on the test, in the 2014-15 and 2015-16 years, were probably partly the result of students simply becoming more familiar with the exam format. This is essentially the “plateau effect,” a term testing experts use to describe the initial boost followed by a drop-off in scores.
But the strong performance on the ELA performance tasks could also indicate more teaching in alignment with the Common Core’s focus on citing evidence in reading and writing.
SBAC said it will establish “additional procedures to monitor and potentially discriminate better between growth due to increased familiarity with the test and true improvement in student learning.”
A version of this news article first appeared in the Curriculum Matters blog.