A particularly tough piece of the testing-consortium work is coming up: setting cutoff scores for each achievement level of the test. One of the two federally funded assessment consortia, Smarter Balanced, is enlisting the public's help to do it.
In addition to the work that will be done by panels of experts, Smarter Balanced is inviting anyone who’s interested—you don’t have to be a teacher or even work in education—to register to participate in the “standard-setting” process. The deadline for registrations is Sept. 19.
Participation will be via computer and is supposed to take less than three hours over the course of two days in October. Reviewers will choose one grade level and one subject (English/language arts or math) to work on. When they first log on, they'll complete a brief training and get acquainted with the consortium's "achievement level descriptors," which describe what student performance should look like at each of the four levels of the Smarter Balanced test ("thorough," "adequate," "partial," or "minimal" understanding of the content). Then they'll start reviewing test questions, arranged in order of difficulty.
The reviewers will get a mix of the item types that will appear on the Smarter Balanced test: some multiple-choice, some short-answer, and a performance task. The answers will be supplied. For the short-answer and performance items, sample student responses of varying quality will be provided to help reviewers distinguish weaker answers from stronger ones.
Reviewers will focus on only one of the four achievement levels: Level 3, the consortium's proficiency level. As they go through the item booklet online, they'll try to figure out which items would represent the bottom rung of difficulty for Level 3 and which would represent the upper part of Level 3, according to consortium spokeswoman Jacqueline King. That way, the reviewers collectively begin to establish the range of performance that constitutes Level 3, and they start to inform where the lower and upper score cutoffs might fall.
The feedback collected from participants will be passed along to the experts who handle the next step of the process. The consortium sees the public input as an important way to include the perspectives of teachers and others as it figures out where to set the bar for each achievement level.
Then the next step of the standard-setting process begins: a group of 500 experts, chosen by the Smarter Balanced states, takes over. In mid-October, they will gather in Dallas to begin discussing test items and appropriate cutoff scores for each achievement level. Unlike the public reviewers, the expert panels will mull over upper and lower ranges for all four achievement levels. Separate panels will convene by grade level and subject (which is why such a large group is needed). So, for instance, while one panel discusses appropriate cut scores for 3rd grade English/language arts at each of the four levels, another will discuss cut scores for 3rd grade math.
Higher education representatives will have a role in this process, since a crucial part of the standard-setting is establishing a cut score for the 11th grade test that connotes readiness for entry-level, credit-bearing college work. Participants in the face-to-face meeting will also use cut scores from other assessments, such as the ACT, SAT, NAEP, PISA and TIMSS, to inform their work as they decide what reasonable scores would be for each level of the Smarter Balanced test.
As we’ve reported, Smarter Balanced embedded some NAEP and PISA items in its field tests, so it can see how students’ performance on those assessment items compares with their performance on consortium-designed items.
About 60 of the 500 expert panelists will then examine all the recommended cut scores in both subjects to make sure the system makes sense from grade to grade. The recommended cutoff scores will be put before the Smarter Balanced governing states for a vote later in the fall.
The other federally funded assessment consortium has a different timeline for its standard-setting. The Partnership for Assessment of Readiness for College and Careers, better known as PARCC, will not have a crowd-sourced element when it sets cutoff scores for the five achievement levels of its test (“distinguished,” “strong,” “moderate,” “partial,” or “minimal” understanding), according to a consortium spokesman. Its states will set cut scores in the spring of 2015, after the first operational test is given.
Ms. King of Smarter Balanced said the consortium will review its cut scores after the first operational test in 2015 and make any "tweaks" that might be necessary. But because the consortium has student responses from 4.2 million field tests given last spring, it feels confident it can set cut scores based on that data, she said.