Can Performance Assessment Survive Success?
The good news is that performance assessment has succeeded: Policymakers from Maine to California and Washington to Florida have discovered it. The list seems to grow daily--Arizona, Arkansas, Alaska, Connecticut, Kentucky, New York, and Vermont are only some of the states that have committed to performance assessment as a means of evaluating student performance or educational programs. The number of school districts turning to performance assessments grows daily. Foundations have given substantial funds to performance assessment. The New Standards Project is working with 17 states and six districts to develop portfolios, projects, and performances as the centerpiece of a national examination system. The most recent indicator of success is the publication of the report of the National Council on Education Standards and Testing, "Raising Standards for American Education," which calls for a new system of standards and system of assessments.
The bad news is that performance assessment has succeeded: Policymakers from Maine to California and Washington to Florida have discovered it. Until now, performance assessment has been nurtured, like tender shoots, in a greenhouse. The work being done by Project Zero and ARTS PROPEL, the imaginative approach to assessment that has emerged in the Pittsburgh Public Schools, the slow and deliberate emergence of mathematics portfolios in California, and the development of science and mathematics assessment in Connecticut have taken place in sheltered environments, out of the public eye and the policy arena--stakes have not been attached to performance. No longer. In Kentucky, salaries and jobs are linked to the achievement of students, which is to be measured using performance assessment. However "voluntary" the national examination system proposed in "Raising Standards," states will be compared and there will be consequences for students and teachers.
Until now, performance assessment has been successful because it is attractive to educators: It provides students with challenging and engaging assessment tasks; it provides teachers with "tests" to which they like to teach; and it provides information about student performance that transcends skills, grade levels, stanines, and percentiles. These are assessments designed to test for deeper understanding in and across subjects and to encourage student reflection on performance.
Performance assessment is attractive to policymakers because it provides information on achievement more in keeping with the needs of a complex world and economy. Accountability is no longer tied to assessing the basic skills that served as the foundation for the first wave of reforms and the testing that went with it.
The perspectives of these two groups are not necessarily in conflict, but successfully grafting the two will be a demanding challenge to the ingenuity of educators and to the educational commitment of policymakers. The fact that both educators and policymakers recognize the importance of assessing authentic achievement at least gives us a place to begin.
What kind of commitment will it take to allow performance assessment to succeed in the current climate? First, we need to be honest. The discussion is not about standards and assessment; it is about systemic change. What we are really asking is that teachers learn a new curriculum, change the way they teach, and assess student results differently--all in the public eye. Given the way we have prepared our teachers, the way schools are structured, the curriculum as it exists, and the support teachers typically receive, this effort will fail, and no standards or assessments, regardless of how exciting they are, will alter this fact. All of these things must be changed if teachers are to be given the opportunity to teach to the new standards and students the opportunity to perform on the assessments. More and more business leaders are learning what it is to restructure: It is hard work, and it takes a systems perspective and an unparalleled commitment to real staff education and leadership.
Second, educators must agree that the establishment of standards and the assessment of performance against those standards are the fundamental seed of reform. Educators need to agree on what we mean by performance and be able to show people--students, parents, business leaders, the public, and policymakers--what constitutes outstanding performance and what is mediocre. We must have compelling alternatives to percentiles, stanines, and grade levels. In Vermont, committees of teachers set extremely high standards and demonstrated their value to their colleagues, legislative leaders, and the business community. Equally important, we as educators must internalize these standards, understand them implicitly, and make them ours--and then we need to help others, most notably our students, take ownership as well.
Third, policymakers must understand that they too will be held accountable for ensuring that students have the opportunity to perform. Last year Vermont piloted writing and mathematics assessment in a sample of schools in the 4th and 8th grades using portfolios. Overall, the year was a striking success. We learned a great deal about portfolio assessment. We found out that teachers could accurately evaluate student work using portfolios. When the media reported the results, the focus was not on these positive accomplishments, or the strong writing performance of students indicated by the pilot assessment. Instead, the story was that Vermont students "failed" at mathematics. Based on the results, committees of teachers created professional-development materials and programs to assist their colleagues in the first full year of implementation. In effect, they established a college of continuing education focusing on writing and mathematics open to all Vermont teachers. To their credit, Vermont policymakers did not rise to the bait offered by the media. They knew they were in it for the long haul and had a commitment to making the program a success.
In "Raising Standards," the National Council recommends that validity and reliability be demonstrated before a test can be used for high stakes. That is not enough. The report contains the right words. "Opportunity to perform," "fairness," "reforming schools," "professional development," "engaging families and communities," and others, are all there, but the passion in the report focuses on student-performance standards and assessment. The report needs to go much further. School-delivery standards and system-delivery standards must be part of any reporting of assessment results. No state or district should be allowed to participate in the national examination system unless it has developed and published these comparative standards. The question is, have educators and policymakers equipped schools and students with the instruments they need to succeed?
State-by-state comparisons will draw the attention of policymakers and the public, but those used for federal monitoring must complement, not contradict, the performance assessments being developed in the states and districts now. The National Assessment of Educational Progress must change rapidly if it is to play that role. The recent mathematics standard-setting process for NAEP is not reassuring on this score.
Fourth, the friends of performance assessment must understand that the tilt of the field has changed. Until now, performance assessment has been winning without having had to perform as a vehicle of assessment in an accountability system. While performance assessment is not a "star wars" technology, as some have likened it, even its friends have been asking probing questions. At last year's American Educational Research Association meeting, one invited symposium was entitled "Performance Assessment--Myth and Reality," while at the annual Education Commission of the States/Colorado assessment meeting, a panel was devoted to "Performance Assessment--Flagship, Fad, or Fraud."
Before the stakes grew, it was possible to slough off the tough questions relating to consequences, equity, or reliability by pointing out that performance assessment was good because we were assessing real writing or real mathematics: This kind of assessment would lead to better instruction. This is no longer enough. We all need to agree that there are legitimate questions to be resolved as we proceed. Among them: How generalizable are results? Can we be sure, from one day to the next, or one rater to the next, that the inferences we would make are the same? What can we infer about the subject of mathematics, or even one of its domains, from performance assessment? In Vermont, we have started the studies that will provide some answers. California and Connecticut have been hard at work on these issues as well. Nationally, the questions are tougher and so is the environment for studying the results. In a different climate we could do evaluation work, improve approaches, let a hundred flowers bloom, picking the best where we found them. That luxury is gone. We need to work with the psychometrician to define the questions.
Can performance assessment survive success? Perhaps, given the right circumstances, but only perhaps. When performance assessment moved into the high-stakes arena, we lost our "license to fail." Yet, knowing the perversions that high-stakes standardized testing has visited on curriculum and instruction--on students and teachers--it is unimaginable that we might assess progress toward our national goals using anything but performance assessment.
W. Ross Brewer is director of planning and policy development for the Vermont Department of Education.