Setting high standards for graduation has become a mantra, but it will remain an empty chant unless concrete, measurable standards are fully and effectively implemented in classrooms. The furor and confusion surrounding the rankings of state standards documents are distracting everyone from the importance of implementing high-quality standards.
That is not to say that states don’t need to know how their standards stack up. They do. Vague, mushy standards that do not challenge students can be relied on to create much mischief and arguably are worse than no standards at all. The good (and bad) news is that targets set are likely to be met. A lot is riding on the quality of state standards; they must shoulder responsibility for improved curricula, modernized instruction, updated textbooks and assessments, enhanced staff and school accountability measures, and refined teacher preparation. But let’s be clear: These rankings evaluate documents (inputs), not student performance (results), and thus are only a fraction of the equation.
That being said, what do the appraisals conducted by the American Federation of Teachers, the Council for Basic Education, and the Thomas B. Fordham Foundation offer us?
While we argue that multiple analyses of state standards are better than no analyses, the grade differentials among the three reports are confounding--enough so to make state leaders either throw up their hands in utter bewilderment or embrace a high mark and ignore the others. Both responses threaten to defeat the very purpose of the reports. For example, Florida received a D from one appraiser and the equivalent of an A from another in mathematics. In both English and mathematics, Michigan received an F from one appraiser and a B-plus from another.
How is a state to make good use of the evaluations despite the tangle of scores? Here is what we found:
- States have a long way to go. While the CBE’s findings were decidedly more upbeat than those of the other appraisers, the three seem to share the common conviction that most states have a long way to go before their academic standards will be strong enough to serve as the foundation on which higher student achievement can rest.
- Excellent state models exist. While the specific list of exemplary models differs from evaluation to evaluation, much agreement exists around the math standards from Arizona, North Carolina, Ohio, Utah, and West Virginia and the English standards from California, Massachusetts, and Virginia.
- Easy grading by one, tougher grading by others tells much of the story. Securing a high mark from Fordham was much harder to do than securing a high mark from the CBE, with the AFT rankings falling somewhere in between. Of course, as with most rules, exceptions exist: In five cases in English and three cases in math, the CBE gave the lowest mark.
- Getting past the grades pays dividends. Fun and tantalizing as grades are to read and compare (we plead guilty ourselves), accepting the grades without the accompanying analyses can easily lead states and standards writers astray. Only by understanding the evaluation rubrics, the procedures, and the underlying values (and biases) of the graders can average citizens and policymakers judge the judgments and know how to use the different analyses to guide future improvements.
- The devil is in the details. Each evaluator developed distinct instruments and applied different methods--it’s no surprise that the appraisals produced different results.
- The three reports graded different things. For clarity and specificity, check out the AFT’s evaluation; the AFT made no attempt to judge the overall quality or rigor of the content. On the other hand, Fordham and the CBE both made judgments (often contrary ones) about rigor. But Fordham graded many other features, including the clarity, specificity, measurability, organization, assumptions, and philosophy of standards, along with an examination of whether the standards included any negative qualities. The reports also differ in their scope: The AFT and Fordham reviewed all grades, whereas the CBE concentrated on two grade levels in math and two in English.
- The underlying values of the evaluations differ markedly. The negative criteria embedded in one analysis acted as positive criteria in another:
Best way to organize standards: The CBE states a preference for standards organized around grade clusters rather than standards set for each grade, while the AFT states a clear preference for the opposite.
Calculator use: Calculator use in the early levels of mathematical instruction was considered a big negative by Fordham; the CBE left technology out of its analysis because of disagreements within its advisory panels about how technology should be addressed.
NCTM standards: The CBE built its math evaluation directly on the National Council of Teachers of Mathematics’ standards; Fordham rejected the NCTM as a model of clear content and performance standards.
Relating standards to one’s personal life: The CBE rewarded states for requiring students to relate literary texts to their own lives; Fordham deducted points for the same.
High standards for all students: In order for a state to get credit for its standards, both the CBE and the AFT required that the standards define expectations for all students; Fordham included no such requirement.
Exemplary models: The CBE built a set of model standards approximated from national documents. Besides rejecting the national documents, Fordham resisted developing its own exemplary models and instead offered several existing state documents that matched its image of what excellent mathematics or English standards should contain.
Who reviews: Under the guidance of advisory councils made up of many of the regular education groups, the Council for Basic Education trained teams of teachers (for three days) to determine how faithfully the state captured the model CBE benchmarks. The Fordham Foundation rejected the “advisory panel-teacher teams” route in favor of securing just a few content experts for each discipline to summarize the basic strengths and weaknesses of state drafts.
- The evaluators themselves explain some of the discrepancies. In a Jan. 21, 1998, memo sent to a group of education reformers and opinion leaders, Fordham stated, "[Our] criteria for judgment were quite different from those employed by the writers of the CBE report.” In calling many of the CBE benchmarks “obscure or invalid,” Fordham appraisers concluded that a state scoring well by virtue of having standards written similarly to the CBE’s model standards necessarily would score badly by Fordham’s estimation, and “the reverse, no doubt.” Fordham also took the CBE process to task. The CBE states emphatically in its document that it searched for equivalent rigor in state documents, rather than equivalent language. Fordham reviewers challenge this claim as more wishful thinking than reality, asserting that states quoting verbatim from the NCTM were assured a high score.
We have reason to believe that as work on standards progresses, a true consensus about the quality of standards and the yardstick used to measure them will emerge. In the meantime, with less than perfect measures, here are seven things that a state can do to use the rankings to improve its standards:
- Read the analyses that accompanied the rankings. (We called several state departments, and most did not know the whereabouts of--or could not remember ever receiving--the private analysis that accompanied the public grades given by the CBE.) Fordham and the AFT made their analyses public alongside the grades. Read yours and those of competing states.
- Value the criticism as much as the praise. No standards document is beyond improvement. Don’t lose sight of the fact that much depends on your standards: The better your product, the more likely you are to get results.
- Recall defective or half-baked standards documents. Act quickly to benchmark your documents against the best state and international models identified by the three reports. Include in the mix the standards from the top producers on national and international assessments. (The ones we like best are listed at our Web site www.goalline.org.) Finding out what the competition is doing is good practice.
Forward-thinking businesses dedicate whole divisions to tearing apart the most successful products on the market to gather clues on what makes them work. Their intention isn’t necessarily to reproduce every feature, but to determine how to get their products to measure up. Adopt that method to upgrade your standards. What is being asked of students in, say, California that is not being asked of your students? Does the level of sophistication asked of your students meet or exceed the level asked of students in the exemplars? Is the level of precision and clarity equivalent?
If you include standards that are not embodied in the model standards, check to make sure the standards in question add value, not flab. Are the “extras” useful, measurable, intelligible, or tied to important content? Benchmarking is painstaking work--a responsibility that takes days, even weeks to complete if done right. If you don’t have people on hand who know how to do this well, there are experts around the country who do this for a living. Hire them.
- You cannot hold too many conversations about the content of standards: Ownership matters. Gaining ownership is as messy as it is powerful. To have any real effect, standards must be incorporated into the life of the school: They must be embraced by the classroom teachers who must teach them, embraced by the students who must learn them, embraced by the parents who must support their children in learning them, and embraced by the business community and colleges who must make informed decisions about whom to invite into their ranks.
- Don’t fear going back to the drawing board now and in the years to come. Answering questions about the sufficiency and rigor of standards with certainty will take time. We will have to evaluate how our students are faring in international comparisons over time and be prepared to update standards accordingly. Moreover, disciplines continue to evolve and grow, so standards that are developed to reflect them must remain provisional enough to leave room for future developments in their respective fields. Without such revisions, standards will soon become artifacts of the age rather than enduring guideposts for improving instruction.
- Focus more on the results of learning--on student achievement. The true measure of whether standards are any good is whether kids are learning more. Use evidence from national and international assessments (and other indicators) to make your case.
Annual academic analyses are devices for education entities--and the policymakers who steer them--to get their priorities right, to challenge the status quo, and to serve as a reminder that the job can get done. They focus the research agendas of policymakers and assist school leaders in their decisions on where to invest intellectual and financial resources. Appropriately disaggregated data enable school leaders to understand who is and is not benefiting from their efforts; to understand the nature of the instructional processes that students are experiencing; and to understand the extent to which programs are meeting or failing to meet reasonable expectations.
- Invest as much time and energy in achieving standards as in setting them. Without an accountability architecture, standards will be words on a page, not engines for reform.
Susan Pimentel and Leslye A. Arsht are the co-founders of StandardsWork at George Washington University’s school of education and human development in Washington. Ms. Pimentel is a co-author with Denis P. Doyle of Raising the Standard: An Eight-Step Action Guide for Schools and Communities (Corwin Press, 1997). Ms. Arsht is the president of the nonprofit Coalition for Goals 2000 Inc. and a former director of communications at the U.S. Department of Education.
A version of this article appeared in the November 11, 1998 edition of Education Week as Don’t Be Confused by the Rankings; Focus on the Results