There's No Such Thing As Grade Inflation
It's that time of year again for teachers. As an elementary school teacher for five years, I hated it. You spend a year getting to know your students, getting to know their strengths and weaknesses, becoming friends with their parents... then you have to go and give a failing grade to one of them.
I admit it. Sometimes I just couldn't do it. A well-deserved F would sometimes become a D- as the time came to actually turn in the final grades. Throw out the lowest quiz grade, give a little extra credit, and I discovered that the student's performance wasn't so bad after all. Eventually, I even realized that I had something of an affinity for the plus sign. I always thought that a B+ just looked better than a B-, so I started giving more pluses and fewer minuses.
I had heard about grade inflation back then; the issue seems to resurface every five years or so. For example, over the past year or two, stories have appeared in this newspaper about Adele Jones, the Delaware high school teacher who was fired because she actually gave D's and F's in her algebra class, and about grade inflation in universities. (See Education Week, Sept. 15, 1993, and March 8, 1995.) One of the illustrative cases involves Stanford University, where it was discovered that 90 percent of the grades awarded are A's or B's. It is apparently a big deal that Stanford has developed a new policy that will permit the use of the grade F for the first time in 25 years. After some initial fuss when stories like these come to light, it seems that things die down and we get back to business as usual.
In my own career, I now find myself at the other end of the educational ladder, teaching at a university. I've become somewhat involved in research on matters of assessment and grading, and I've discovered that there really is no such thing as grade inflation. Inflation occurs when, for example, it takes $10 to buy something that used to cost $5, it takes $20 to buy something that used to cost $10, and so on. The problem with grades can't accurately be called inflation because there is no assessment equivalent to a $20 bill: Once you hit the end of the scale--the A grade--then you're stuck; there's nothing higher. So, a D might become a C, and a C might inflate to a B, but in the end it all has to stop at the A. A more accurate description of this phenomenon might be grade compression.
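The difference between inflation and compression can be sketched in a few lines of Python. This is only an illustration, not a model of any real grading system: the five-letter scale and the `bump` helper are assumptions made up for the example.

```python
# Hypothetical illustration: letter grades on a capped scale.
SCALE = ["F", "D", "C", "B", "A"]  # lowest to highest; nothing sits above "A"


def bump(grades, steps):
    """Raise every grade by `steps` letters, stopping at the top of the scale."""
    top = len(SCALE) - 1
    return [SCALE[min(SCALE.index(g) + steps, top)] for g in grades]


original = ["F", "D", "C", "B", "A"]
once = bump(original, 1)
twice = bump(original, 2)

# Count how many different grades remain in use after each round.
for label, grades in [("original", original), ("once", once), ("twice", twice)]:
    print(label, grades, "distinct grades:", len(set(grades)))
```

Prices can keep rising without limit, so every price stays distinguishable from every other. On the capped scale, each upward bump pushes another group of students into the A, and the number of grades actually in use shrinks from five to four to three.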
Of course, this phenomenon doesn't just apply to A's and B's. There are all kinds of marking systems in use: numbers, symbols, letters, or descriptors like "emerging," "developing," and "maturing." Whatever the surface differences, though, they are still primitive and abused tools for doing the job of communicating achievement information.
Research into teachers' grading practices is discouraging, although it does illuminate some of the reasons for grade compression. Evidence from a recent study provides some information about what goes into the grading decisions that teachers make. In the study, we asked over 100 teachers about their grading practices. The findings revealed great differences in what teachers actually do, and great uncertainty about what should be done. For example, when asked to indicate what factors they consider when assigning marks to assignments and tests, 83 percent indicated that they considered the percent or number correct on the assignment; however, from one-third to one-half of the teachers also said they considered things like the difficulty of the assignment, how the class performed overall, the individual students' ability levels, and how much effort a student put into the work.
When assigning final grades, the marks that had been given to individual assignments and tests--that uncertain mix described above--were combined with three other kinds of information: 1) formal achievement-related measures (for example, attendance, class participation, extra-credit work); 2) informal achievement-related measures (such as answers to in-class questions, one-on-one discussions); and 3) other informal information (impressions of effort, conduct, teamwork, leadership, and so on).
The conclusion? It seems that nearly everything is considered when assigning a grade. I call this the "kitchen sink" phenomenon: Everything is thrown into the grade including the kitchen sink. There are several reasons for the kitchen-sink phenomenon. One is that teachers naturally want to consider any relevant aspect of a student's classroom experience when assigning a grade. Unfortunately, no consensus exists--even within school buildings--about which factors are relevant.
At least now, though, we might have some insight into the roots of grade compression. The research points clearly to what might be called a success orientation in assigning marks. Although teachers really do consider a variety of factors in assigning a final grade, they combine the information in idiosyncratic ways: Not only do different teachers use different elements, they also combine the elements in different proportions. In practice, the success orientation means that, within a classroom, the factors considered in arriving at a final grade are weighed in ways that are most advantageous for each student. For example, in math class, a student who has not mastered fractions may still be awarded a B+ for maintaining a positive attitude, regularly participating in class discussions, and trying hard. On the other hand, an A student who has mastered fractions would usually not be downgraded for being pessimistic, silent during discussions, or "coasting."
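One way to picture the success orientation is as a teacher choosing, for each student, whichever weighting of the factors yields the highest result. The sketch below is a deliberate simplification under that assumption; the scores, the factor names, and the candidate weightings are all invented for illustration and do not come from the study.

```python
# Hypothetical scores (0-100); the numbers are invented for illustration.
students = {
    "struggles_with_fractions": {"mastery": 55, "effort": 95, "participation": 90},
    "mastered_fractions": {"mastery": 95, "effort": 50, "participation": 40},
}

# Assumed candidate weightings a teacher might apply; each sums to 1.
weightings = [
    {"mastery": 1.0, "effort": 0.0, "participation": 0.0},
    {"mastery": 0.4, "effort": 0.3, "participation": 0.3},
    {"mastery": 0.2, "effort": 0.4, "participation": 0.4},
]


def composite(scores, weights):
    """Weighted sum of a student's factor scores."""
    return sum(scores[factor] * weight for factor, weight in weights.items())


def fixed_policy(scores):
    """Grade on mastery alone (the first weighting) for every student."""
    return composite(scores, weightings[0])


def success_oriented(scores):
    """Use whichever weighting is most advantageous for this student."""
    return max(composite(scores, w) for w in weightings)


for name, scores in students.items():
    print(name, round(fixed_policy(scores), 1), round(success_oriented(scores), 1))
```

With these made-up numbers, the two students are 40 points apart when mastery alone counts, but only about 10 points apart when each is graded by the most favorable weighting. The student who has not mastered the material moves up; the one who has stays put.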
Another root of grade compression is the well-documented lack of preparation for educators in matters of assessment at all levels. Competence in assessment is rarely a prerequisite to teacher licensure; the picture is even more grim regarding the training, experience, and requirements for administrators. The recent collaborative work of the American Association of School Administrators, the National Association of Elementary School Principals, and the National Association of Secondary School Principals investigating assessment competencies of educational administrators identified many areas of need.
A final factor must also be considered. Perhaps the most widely misunderstood relationship in education is that between achievement and self-concept. Hosts of pre-service teachers were taught to believe that a key to raising students' achievement was to make them feel better about themselves: "If we build it [their self-concept], their grades will come [up]." Once they landed teaching jobs, the bad advice was reinforced. One survey revealed that 85 percent of New York state teachers said that they were pressured by administrators to give higher grades. (See Education Week, April 27, 1994.)
Sadly, though, research into self-concept and achievement has revealed that: a) the relationship between the two is extremely weak and possibly nonexistent; and b) it quite frequently operates in the opposite direction--students who first work to accomplish valuable educational goals experience a consequent rise in how they feel about themselves.
The bottom line is that all of these forces converge to compel us to finally follow the advice our parents gave us: "If you can't say something good about someone, don't say anything." And, because in most cases we are usually able to find something good to say about our students, grade compression results.
Although our parents might be happy that we have at last begun to follow their advice, parents of American schoolchildren may not be so happy. To them, grades are assumed to be indicators of achievement or content mastery. They might not readily comprehend the logic of giving Johnny an A in reading for keeping his desk so tidy.
The glut of A's is beginning to confuse students as well, resulting in distorted interpretations of the relationship between perceived and actual competence. In a study of this relationship, researcher Harold Stevenson found that "[students'] self-evaluation of their skills in reading and math were unrelated to their actual level of achievement." Because the reality of grade compression is never explicitly recognized or codified in school grading policies, students rightfully assume--like nearly everyone else--that their A's and B's mean that they have successfully mastered rigorous academic work.
But they don't. Because of grade compression, grades have lost nearly all meaning. It used to be the case that grades were a delicate mix of criterion- and norm-referenced information. It could be inferred that a student who received a B had both mastered a substantial chunk of the material and done so a little better than most. Now, however, everybody gets A's and B's. And there are some who argue that this is a good thing. For example, commentator Alfie Kohn suggests in a recent book, Punished by Rewards, that there should be only two grades: A and "in progress."
Quite possibly, the wish may already have come true. Has anybody noticed that nearly every vehicle has a bumper sticker that boasts about the "honor student" status of the back-seat occupants? High schools are doing away with class rankings and valedictorian status. My own school-age children have a hard time thinking of anybody they know who has flunked anything.
Could this happen in any other field but education? What good would Siskel and Ebert be if they gave every film "two thumbs up"? Why would anyone subscribe to Consumer's Digest if every blender were rated a "best buy"? Would anyone care about the upcoming Olympic Games if every athlete who showed up were given a gold medal?
But this is exactly what is happening in education. Everyone has jumped on the A train. Education critic Charles Sykes has likened the current ethos to the attitude of the Dodo in Alice's Adventures in Wonderland, who cries: "Everyone has won and all must have prizes."
There are three great ironies about this situation. The first is that grade compression is occurring despite evidence that real student achievement is actually becoming more, not less, variable. The gap between college-bound students and those who drop out, don't meet graduation requirements, or don't pass a state-mandated competency test may be wider than ever. Yet everyone gets good grades.
The second is that grade compression degrades our ability to investigate important educational issues. For example, as high school grades become more homogeneous, Scholastic Assessment Test scores appear to be less accurate predictors of success in college; as college grades become compressed, grade-point averages appear to be less strongly related to future job performance, and so on.
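The weakening of these relationships is what one would expect whenever one of two related measures is squeezed against a ceiling. A small sketch, using invented data and a hand-written Pearson correlation, shows the effect; the predictor scores, grade points, and two-step upward shift are all assumptions chosen for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


TOP = 4  # grade points for an A; nothing higher exists

# Invented data: ten students, a test-score predictor, and grade points (F=0 ... A=4).
predictor = list(range(1, 11))
grades = [0, 0, 1, 1, 2, 2, 3, 3, 4, 4]

# Shift every grade up two steps, capped at the top of the scale.
compressed = [min(g + 2, TOP) for g in grades]

r_before = pearson(predictor, grades)
r_after = pearson(predictor, compressed)
print(round(r_before, 2), round(r_after, 2))
```

In this toy data set the correlation falls from roughly 0.98 to about 0.87 once the grades are compressed, even though nothing about the students or the predictor has changed. The predictor only appears weaker because the grades no longer distinguish among the top six students.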
The third is that ostensible assessment reforms may actually have contributed to the problem. Assessment innovations have prompted teachers to gather a more diverse array of information about student performance. All the while, questions like "What should be done with all of this information?" or, the more practical question, "How should grades be assigned?" have gone unaddressed in the reform movement. At a time when assessment information is widely hailed as being richer and fuller, grade compression has actually resulted in less information about student performance being communicated to students and parents.
In fact, as communication devices, grades are currently more like two tin cans and a length of string than they are like a cellular phone. It's an interesting technological contrast: As bubble sheets whiz through a scanner in a district testing office, a teacher mulls a pile of papers with stickers and happy faces on them, finally deciding that this student's work merits a B+ for the marking period. A study by Kristie Waltman and David Frisbie of parents' understanding of the information provided on report cards was not optimistic: The authors concluded that, if report cards are viewed as a vehicle for accomplishing a transmission of a teacher's intended meaning to parents, the job is not generally being accomplished successfully.
Can the A train be stopped, now that it has left the station? Perhaps. The answer may even play a part in determining whether American education-reform efforts are likely to be effective. While assessment is a centerpiece of many current reform initiatives, there will be no educational advantage if the meaning of these modern measures remains murky. Although assessment reforms have introduced a new wealth of assessment information to teachers, parents, and students, our understanding of how to actually use and report this variety of information may be getting worse, not better. Everybody still gets A's.
The task of combating grade compression is difficult because there is almost no incentive for anyone to address the problem. Administrators know that high grades are good public relations; teachers know that relationships with parents are more congenial when they assign high grades; high grades certainly don't upset students; and everybody is doing it anyway. But all of us must ask ourselves: Can American education abide another "Lake Wobegon" in which grade compression leaves us unsure about students' real performance capabilities and differences--a place where report cards, portfolios, and "alternative" assessments reveal everyone to be above average?
The challenge illustrated by grade compression points the way to several immediate and practical steps that require the involvement of all those interested in reform:
- Educational leaders must develop an "assessment vision." Considering the increasing diversity of assessment purposes, formats, and demands, it is fair to say that the big picture in educational assessment is sometimes a chaotic one, and is perhaps the most neglected issue in the current discourse regarding assessment reform. Educational leaders must promote a clear, coordinated conception--a vision--about the varieties of assessment that occur, the purposes they serve, and the meaning of evaluations of student performance. To be effective in promoting reforms, this vision must be communicated to all interested parties--teachers, parents, community members, and students.
- All educators must make personal commitments to professional development. Because of the widespread lack of training in assessment, professional development in this area should be a top priority. In order to engage in educationally sound evaluation and grading practice, educators need more information about this ignored aspect of assessment reform.
- Information must be relevant to classrooms. Even when teachers and administrators are formally exposed to matters of testing and grading, coursework at the university level often focuses on aspects of testing and grading that may not be applicable to those who must actually do these things. Redesign of college coursework to provide more relevant training is necessary.
- Professional organizations must promote sound evaluation practice. As noted earlier, a few professional organizations have become active in this area, though more work is necessary to highlight the need for and benefits of sound assessment and grading practice.
- Grading policies must be developed and applied consistently. It may be that the system of A's, B's, and C's does not communicate the information we would like about student achievement. Regardless of the system used, administrators, parents, and teachers must work together to develop, affirm, disseminate, and maintain consistent grading policies. In order to maximize the meaningfulness of grades, developmental efforts should work to build consensus on the policies from the ground up, listening closely to the information needs of parents, students, employers, and universities.
- End isolationism. Uncertainties and unsound practices can flourish when teachers are isolated from each other. Teachers must take the initiative to begin interaction and collaboration on their grading practices. Administrators can facilitate collaboration and encourage consistency.
- Assessment experts must lend a hand. The proliferation of innovative assessment formats has outpaced development of ways to interpret and report the wealth of information. Experts in testing must explore new ways of synthesizing and communicating the information provided by alternative assessments so as to take full advantage of the innovations.
Possibly the most profound and revolutionary strategy involves beginning to initiate students into a new grading culture. Students have often come to see grades and real learning as disassociated, to prize grades more than education, and to view hard work as unrelated to school success. A true revolutionary, Thomas Paine, captured the essence of the problem, observing, "That which we obtain too easily, we esteem too lightly." A truly significant educational reform will help students see the linkage between mastery of valuable knowledge, skills, and abilities and the grades they receive, and will be one in which real learning is perceived as more desirable than simply the indicator that achievement has occurred.
Vol. 15, Issue 30, Pages 22, 32. Published in Print: April 17, 1996, as There's No Such Thing As Grade Inflation