Bucking the national craze for large-scale testing, one district focuses its efforts on the assessments teachers make in the classrooms.
First to go were the vocabulary quizzes. “I used to give them 15 words a week,” recalls Teresa Abrahams, an English teacher at Lincoln Northeast High School here. “It was hideous,” she shudders. “The quizzes I had designed didn’t actually tell me if they knew the words or not.”
Now, she assigns her students about eight vocabulary words a week. To ensure that the students understand the words, they must offer an analogy for two of them—along with a thorough explanation of their reasoning. On top of that, they have to define and provide antonyms for three words. And they have to write sentences for three more that make their meaning clear.
Such simple changes are part of a much more profound shift in how teachers in the 32,000-student Lincoln public schools use and think about classroom assessments.
Regardless of the form they take—weekly quizzes, end-of-semester tests, teacher questioning, comments and grades on homework, oral presentations, projects, or portfolios—classroom assessments are the most common, and some would say most ignored, kind of educational measurement.
While most public attention these days is riveted on the results of large-scale testing programs, research suggests that the classroom assessments teachers use day in and day out provide one of the most powerful tools available for improving student achievement.
“We have strong evidence that high-quality classroom assessment improves learning tremendously,” says Lorrie A. Shepard, the dean of the education college at the University of Colorado at Boulder, “possibly more effectively than any other sort of teaching intervention.”
Research has found that good classroom assessments can help teachers plan the next step in instruction, clarify the learning goals for students, and provide effective, frequent feedback about how to improve and actively engage youngsters in their own learning. Studies also have found that more demanding, intellectually challenging classroom assignments are linked with higher-quality student work.
The United Kingdom’s Assessment Reform Group calls such formative assessments (ones used primarily to guide instruction rather than sum it up) “assessments for learning,” to distinguish them from “assessments of learning.” A review the group commissioned found that improved formative assessments typically yield gains in student achievement roughly equivalent to one to two grade levels in learning. Equally compelling, improved classroom assessments help low achievers the most, thereby reducing achievement gaps.
In contrast, state testing programs often fail to provide teachers with useful information.
“Large-scale-assessment programs are never going to provide teachers with the information they need, because by the time teachers get the information back, they’ve already moved on,” argues Carl D. “Nick” Novak, the director of evaluation for the Lincoln district. “When teachers need to make a decision, they need to make it right then. And they’ll make it on the basis of whatever information is available.”
Lincoln, the capital of Nebraska, is the quintessential Midwestern town, with wide streets, ample green spaces, and miles of blue horizon. Even after a decade of rapid growth, spanking-new subdivisions quickly give way to farmland on the city’s outskirts. It’s a place where people are still polite to strangers. Drivers yield to oncoming traffic. And school hallways are litter-free and sparkle with new paint.
It hardly seems the place for a revolution. But the concerted attention that Lincoln, and the state of Nebraska generally, have decided to focus on improving classroom assessment is in many ways revolutionary.
Back in 1993, the school system here began a periodic review of its assessment program. “We were not only heavily, but totally, reliant on norm-referenced, standardized achievement tests,” says Marilyn S. Moore, the associate superintendent for instruction. “If someone said to me, ‘How well are your kids doing in 7th grade math?’” she chuckles, “the only answer I could give them was to say the MAT [Metropolitan Achievement Test] scores in 7th grade were X.”
Coincidentally, the school system had been running a series of institutes for teachers about assessment. “It quickly became apparent how little we knew about classroom-assessment practices,” Novak says. “We had a very good group of teachers, but they were largely naive about assessment practices.”
Lincoln’s teachers were hardly alone. Currently, only 14 states explicitly require teachers to demonstrate competence in assessment to earn a teaching license, according to a report from the National Research Council. And only three states demand that principals demonstrate expertise in assessment.
In Nebraska, notes Leslie E. Lukin, an assessment specialist for the Lincoln schools, teachers, on average, might have had eight direct hours of instruction in assessment as part of their teacher-preparation programs. “The assessment part of teacher training is abysmal,” declares Moore. “Teachers don’t learn as undergraduates how to do quality assessment.”
Partly as a result, studies have found that teachers tend to devise tests that emphasize students’ factual knowledge or rote learning, and frequently mirror the multiple-choice and short-answer formats of their state’s large-scale testing programs. Marks and grades on assignments often fail to provide students with guidance about how to improve, while teachers’ questioning techniques too often limit students’ responses.
To counter such practices, Lincoln’s 1993 assessment study made a series of recommendations that provided the catalyst for much of the work now in place. It advocated that the district set up a “more balanced assessment program” that relied less on norm-referenced tests. That suggestion led to a major effort by the district to craft its own tests, designed to measure its academic-content standards and written chiefly by classroom teachers. Many of the district’s tests take the form of classroom assessments that individual teachers give at the end of a unit, or when they perceive students are ready. The report also recommended that the district increase support for “the improvement of instructional assessment at the classroom and building levels.”
So in 1994, the Lincoln district applied for and won a three-year, $247,000 grant from the state to improve classroom-assessment practices. One of the primary vehicles the district has used since to enhance educators’ expertise is the Assessment Literacy Learning Team.
The concept, developed by Richard J. Stiggins, the president of the Assessment Training Institute in Portland, Ore., involves small groups of teachers, and ideally administrators, that meet eight to 12 times a year to read about assessment, design classroom-assessment materials and practices, try them out, and then come back together to share, analyze, and refine what the members have done. Each team member also completes assignments to begin building an assessment portfolio for classroom use.
The district piloted the first four learning teams in spring 2000. In the 2000-01 school year, it expanded the teams districtwide, on a voluntary basis. Five elementary schools, three middle schools, three of the district’s four high schools, and two alternative schools signed up. The district also continues to work with teachers on writing and phasing in districtwide criterion-referenced tests, which measure students’ knowledge of the district’s content standards.
“I think one reason classroom teachers have really liked the attention to classroom assessments is it’s real—it feels personal to them,” Moore says. “It is immediately useful, and it is directly tied to the curriculum.”
Before Wendi Herbin, a mathematics teacher at Lincoln Southeast High School, joined an assessment-literacy team, assessment was the last task she focused on in preparing a lesson.
“It used to be, ‘Oh, the quiz is the last thing you throw together,’” she muses.
“Now,” she says, “I write my quizzes before I teach. I’m very clear about what I want to teach before I start teaching.”
Those goals are also clearer to her students.
One recent day, Herbin begins a lesson for her advanced-algebra class by putting the day’s learning goal on the overhead projector—a regular part of her practice. Before launching into problem sets, she gives the students examples of how the math might be used to solve situations in daily life. Would a taller runner or a shorter runner have an advantage for speed? Would one skydiver or a group of skydivers in formation fall more quickly? “We’re going to actually attack those specific problems tomorrow,” she reassures them, “but we need to do some background work today.”
The 10-year veteran also gives students practice quizzes before every exam so they know “exactly what they’re going to be assessed on and how,” she says. “There are no secrets.”
As part of such an open approach, students keep portfolios of their mathematics work that allow them to keep track of their grades. They also have a copy of the districtwide objectives for the course. Before each quiz or test, they can look at the objectives and conduct a quick self-check of which concepts and skills they have mastered and which need more attention.
Perhaps most welcome, from the students’ perspective, is that Herbin allows them to retake different forms of each test up to four times, as long as they provide evidence beforehand that they’ve actually reviewed the material.
“We have opportunities to retake all the tests and can work to get the grade that you want,” explains Brad Claussen, a junior. “It’s set up so you get the grade that you work for. There’s always room for improvement, if you want to take the time.”
Herbin helped pilot the district’s assessment-literacy teams, which she describes as “the best discussions I’ve ever had,” in large part because the experience was truly applicable to her work.
At Park Middle School, toward the city’s center, math teacher Todd Noble describes similarly small, but profound, changes in how he approaches assessment. “When I first started, I’d think, ‘Oh, I need to write a quiz. I guess I’ll pick about 20 questions; that’s a nice number. Five points each, no matter how hard they are.’ And then I’d write my quiz.”
“Now, before I write a quiz or a test, I think about what are the most important topics I should include. I make sure that I have some problem-solving questions. I make sure that there are some different levels of thinking going on.”
In the past, says Noble, a teacher of 11 years, he used tests strictly for grading. These days, he spends more time reviewing students’ answers, looking for patterns, and thinking about what he needs to reteach differently.
At Humann Elementary School, teachers also are engaged in systematic data-gathering to inform teaching and learning. On a recent afternoon, the 5th grade team at Humann meets with the principal and assistant principal to review a range of assessment results and decide how to adjust their teaching, move children from one reading group to another, and communicate students’ progress, or lack of it, to parents.
The meeting is centered on large spreadsheets of data that provide at-a-glance information about every 5th grader since the beginning of the school year. Teacher Jacki Sanks notes that the team has adjusted reading groups 10 times during the year in response to students’ needs. “The nice thing,” she says, “is we have the data to back it up.”
Unlike in most states, where Lincoln’s efforts would be bucking the tide of high-stakes, state-level testing, Nebraska’s commissioner of education is himself an ardent supporter of district- and classroom-level assessments.
Nebraska is the only state besides Iowa that has not mandated a uniform, statewide exam, though it does have a 4th grade writing test.
Commissioner Doug Christensen, a towering, blunt-spoken Midwesterner, describes state tests as “scorekeeping devices,” rather than tools that can help teachers in the classroom. “You take an athletic contest,” begins Christensen, a former high school coach who played football, basketball, and baseball in high school and college. “Just knowing the final score at the end of the game doesn’t tell you anything about how the game was played; it just tells you whether you won or lost. So coaches aren’t able to do much about their coaching based on scores. They do it based upon the data they collect on what happens during the game and during practices.”
The same, he argues, is true of instruction. It’s the day-to-day work in the classroom that counts, not the scores on once-a-year, state-level tests. Besides, he adds, “I’m opposed to absolutely anything that removes teacher judgment from decisionmaking.”
Under Nebraska’s STARS program—for School-based Teacher-led Assessment and Reporting System—each district is required to set learning goals or standards for what all students should know, or use the state’s model standards. Districts may use whatever measures they wish to gauge student achievement against those standards, including teachers’ classroom assessments.
To ensure quality, each district must submit an assessment portfolio to the state; the portfolio then undergoes an external review. In addition, districts must include one of five national, standardized tests in their assessment systems. That way, notes Christensen, “they can’t claim 90 percent of the kids are proficient in reading at the 4th grade, if their norm-referenced scores are at the 40th percentile.”
Districts must release their assessment results to the public. The state also publicly rates each district on the quality of its assessment program. Nebraska released its first state-level report card last year, based on a combination of district, state, and national assessments, including scores on the ACT, the college-admissions exam taken by 74 percent of Nebraska’s high school graduates, and the National Assessment of Educational Progress, the federal program that tests a sampling of students in key subjects.
“The way Nebraska has approached assessment has made it possible for school districts to do this,” says Philip H. Schoo, the superintendent of the Lincoln schools since 1985. “We’re not afraid of accountability, but we don’t think measuring it nationally with a single test, or even a statewide test, is going to accomplish it.”
The state has provided grants to help districts devise standards and assessments and offered assessment workshops to teachers and administrators. And it has trained a cadre of individuals who can educate others about assessment, through its work with Stiggins’ Assessment Training Institute. Meanwhile, state legislators appropriated $10 million to regional service units, in part to help small districts collaborate in meeting their assessment needs.
“We wanted a system where the quality of classroom assessment was so good that you could aggregate it up for accountability results,” says Christensen. “That’s our goal.”
Teresa Abrahams’ more immediate goal is to help students assume accountability for their own learning.
When students in her Advanced Placement literature class at Lincoln Northeast High turn in an essay, for example, they also turn in a form that asks them to comment on the essay’s content, organization, voice, diction, sentence fluency, and adherence to writing conventions. Abrahams then responds, in part, to students’ own perceptions about what they were trying to accomplish, what they did well, and where they missed the target.
“I think I’ve got a better idea of how I’m supposed to be analyzing my writing,” says Erik Owomoyela, a senior. “At the beginning of the year, I was like, ‘Voice? Diction? Sentence fluency? Aren’t those all the same thing?’ I have found myself thinking about it a lot more.”
To keep track of her students’ progress, Abrahams has purchased large, black three-ring binders that are kept on a wooden bookshelf at the back of the classroom. Each binder, which includes all of a student’s work, is partly a record-keeping tool, enabling a student to compute his or her grade at any time.
At the end of last semester, Abrahams, who’s been teaching 23 years, asked students to review their individual portfolios and complete a “learning survey.” Students had to identify what they had learned about literature and language, why they were learning it, and how they knew when they had actually learned what was expected of them. The students also had to explain what they did, or could do, when they didn’t understand something in class. Finally, they had to share their portfolio with an adult.
“I shared my answers with my supervisor at work, and it was kind of interesting,” says senior Christina Hall, “because then you had to review what you’d learned and realize you’d learned more than you thought.”
Abrahams also asks students to write “author’s notes,” in which they reflect on the difficulties in their writing, as well as what’s easy for them. She’s noticed, for instance, that when students are grappling with challenging or unfamiliar ideas, their sentence fluency often breaks down. “That recursive process of writing means I can’t predict in a linear way where they’re going to need help.”
Research has found that actively engaging students in reflections upon their work—through self- or peer assessments—is one of the most critical features of high-quality formative assessments.
“You can’t make people learn,” notes Paul J. Black, an emeritus professor of science education at King’s College, in London, who co-wrote the review of studies on classroom assessment for the Assessment Reform Group. “You can only support, direct, encourage, and guide learning. So appraising where you are, where you need to be, and how to cross the gap requires self-assessment.”
That same kind of self-reflection is built into Wendi Herbin’s advanced algebra class. In fact, Herbin got the idea of using math portfolios from Abrahams.
Last semester, Herbin asked students to choose one section of their portfolio and write about it: What algebra topic did they show improvement on? What did they do to improve? What types of mistakes did they originally make and why were those wrong? Students also had to provide evidence of a homework assignment they had reworked at least once and a quiz or test they had retaken up to four times. And they had to reflect on how they would use the experience to help them learn in the future.
“I’ve never done this type of portfolio in a math class,” says Claussen, the Lincoln Southeast High junior. “In other classes, all I’ve done is turned in my homework, and you’re done.”
“It was stressful to put it together,” says junior Jill Schwarz, “but it was worth it. It’s good to see how you improved on something, especially on the test and retest.”
Even students in the district’s primary grades are developing their own criteria for what constitutes high-quality work and using it to critique their own work and that of others.
On a warm spring day in Pat Polly’s 3rd and 4th grade classroom, pairs of students are hunched over tables, sprawled on the sofa, or nestled into a big, welcoming armchair to read each other their picture books.
To review the books, the Cavett Elementary School students are using a rubric—a detailed description for identifying varying levels of performance—that they devised as a group. First, they worked with Polly to identify the features of a “satisfactory” picture book. Then, they worked in groups of three or four to identify the features of an “emerging” or “proficient” assignment. Finally, they came back together to agree on the criteria.
“We looked at the ‘S’ and decided what was better,” explains Alyssa Kloefkorn, a 4th grader. “One thing that was hard was that when you were in the groups, you had to agree.”
“It’s kind of fun working up your own grades, instead of the teacher grading your paper,” says classmate Jacob Jirovec. “If you know how to get a grade, you know what is expected of you.”
Now, Alyssa confers with a classmate about her own story—and she’s a ruthless judge. “That wasn’t really my best handwriting,” she sighs, “so I’d put ‘S.’”
Polly, who helped pilot the assessment-literacy teams for the district, has been teaching for 30 years, but says she’s still growing in terms of assessment. “Part of it, as a teacher, is giving up the control and letting the children become involved,” she says. “They’re very honest with themselves.”
Polly also shares the rubrics and other information from classroom and district-level assessments with parents to give them a clearer sense of what’s expected and how their children are performing. The district also has produced math and reading records for teachers to share with parents that, among other information, list the materials used in class, the authors studied, and students’ mastery of core learning objectives.
One of the challenges in getting students to engage in more formative assessments, however, is encouraging them to reveal what they don’t know. That’s hard in an environment often focused on grades and competition.
“Many kids don’t have the experience of sitting down with a teacher and conferencing about a paper and talking about what they’re really struggling with,” notes the University of Colorado’s Shepard. “That kind of interaction has to be fostered, even if you have to self-consciously say, ‘I want this assignment not to be graded, or this version of an assignment not to be graded.’”
Indeed, research has shown that if students are given only marks or grades, they don’t benefit from the feedback. In contrast, comments focused on the criteria for high-quality work and what the student can do to improve are effective in raising performance.
That’s a hard message to swallow, particularly in the United States, notes Black of King’s College. “It’s been clear to me that the whole business of grades and points and competition is quite a larger business in the United States than it is in our schools,” he says. “So, yeah, you have problems.”
Though the Lincoln public schools have focused on improving district- and classroom-level assessments, the district also has seen gains on large-scale test scores, which officials attribute, at least in part, to their new assessment system. Between 1997 and 2001, the scores of Lincoln’s 3rd graders on the Metropolitan Achievement Test, a national norm-referenced exam, jumped from the 52nd to the 69th percentile in reading and from the 55th to the 81st percentile in mathematics. Sixth graders’ scores climbed from the 54th to the 62nd percentile in reading and the 53rd to the 71st percentile in math. MAT scores for 8th graders improved at a more modest rate, from the 57th to the 62nd percentile in reading and the 65th to the 71st percentile in math.
Even so, many of the district’s efforts, such as the learning teams, have yet to reach most teachers. And there’s been less work on certain aspects of formative assessment, such as teachers’ questioning and observation techniques.
Time is one of the biggest challenges. Park Middle School’s Noble notes that with five classes a day, it’s virtually impossible to devise good test specifications for every quiz or test. “If I had the time to build a good test,” he says, “it’s certainly worth it.”
Commissioner Christensen estimates that perhaps 75 of the state’s 535 districts have done a masterful job of combining classroom-based, district-developed, and nationally norm-referenced tests into an assessment system. With 400 districts having fewer than 500 students, districts are going to have to collaborate. And they are going to have to focus less on creating new, district-level tests and more on building upon the tests that teachers already use in their classes, he says.
The assessment-literacy teams also barely scratch the surface in terms of the knowledge teachers need about good formative assessment. To address that problem, three years ago, the University of Nebraska at Lincoln launched what it calls an Assessment Cohort, which provides teachers with a college-level course about classroom assessment.
The program consists of 18 hours of graduate-level courses for practicing teachers, spanning two summers, and a practicum during the school year. Both Abrahams and Herbin are part of the cohort. The program received an award this year from the National Council for Measurement in Education.
The state of Nebraska plans to provide teachers who successfully complete the program with a notation on their licenses that identifies them as experts in assessment. Lincoln officials also are working to establish a district-level certificate that would recognize teachers with assessment expertise.
But perhaps the biggest challenge is maintaining a steadfast focus on improving classroom assessment in the face of so much high-stakes, large-scale testing. The “No Child Left Behind” Act of 2001 requires states to administer annual reading and math tests to each student in grades 3-8 and at least once in high school.
“If the rules of the game are how you do on the state test—or on a national test, God forbid—then that’s what you play,” laments Christensen. “I think it’s very hard for the day-to-day activities of the classroom teacher to matter as much as it ought to matter when there’s a big game to be played later on.”
“One of my worries,” agrees Shepard, “is that people are just going to take large-scale assessments and administer them in classrooms and assume that’s what’s meant by classroom assessment.”
She recommends that states embed high-quality classroom assessments in the professional development teachers get on curriculum linked to state standards.
In addition, several prominent groups are calling for a better balance between classroom and large-scale testing. A report commissioned by five national education associations, “Building Tests to Support Instruction and Accountability,” advocates that states provide educators with optional classroom-assessment procedures that teachers can use to measure students’ progress in meeting content standards not measured by state tests. And a report by the National Research Council calls for shifting more attention and resources toward classroom assessments as part of establishing a comprehensive, coherent assessment system. Several organizations also are working with states and districts to help teachers improve their assessment practices. And earlier this month, a group representing 18 national education organizations approved standards for student evaluation.
“It’s really easy for the emphasis to be on standardized testing because that’s the most visible,” says Lukin, Lincoln’s assessment specialist. “That’s what grabs headlines. And when you have limited time and resources, it’s hard for that not to suck all the resources you have.”
But in Lincoln and Nebraska, at least for now, teachers know they have an ally.
“When you hear the commissioner of education saying over and over how much he values what teachers do, and the importance of classroom-based assessment,” says Abrahams, “then you think, ‘I’d better pay attention.’”
Coverage of research is underwritten in part by a grant from the Spencer Foundation.
A version of this article appeared in the May 22, 2002 edition of Education Week as Up Close and Personal