Grade Inflation: A Problem and a Proposal

A national survey released at the beginning of this academic year suggested that public schools are suffering from the same inflation of grades that has beset undergraduate university programs. In a survey of entering college freshmen, researchers at the University of California at Los Angeles found that grade inflation in high school had hit an all-time high. For example, 27 percent of the students reported that their grade averages were A-minus or higher, compared with 12.5 percent who said so in 1969.

At the undergraduate level, several studies have confirmed that the phenomenon exists there to perhaps an even greater degree. For example, in a study he conducted while at Harvard University, Arthur Levine, the president of Teachers College, Columbia University, found, using a representative sample of 4,900 undergraduates, that the proportion of students with grade-point averages of A-minus or higher almost quadrupled in the years from 1969 to 1993, moving from 7 percent to 26 percent. Ironically, however, 60 percent of the students in the 1993 sample reported the belief that their grade-point average understated the true quality of their academic work.

Ironically, too, many of the best undergraduate institutions are leading rather than resisting this trend. At Harvard, the proportion of undergraduate grades of A-minus or higher increased from 22 percent in 1966 to 43 percent in 1991. The grade of C, which nominally signifies "average" performance, has virtually disappeared; in 1991, over 90 percent of the grades were B-minus or higher. At Smith College, the proportion of A's and B's is 89.3 percent. At Princeton University, 80 percent of undergraduates receive nothing but A's and B's, and at Stanford University, only 8 percent get C's and D's, with none getting F's. At Williams College, 48 percent of the seniors graduated with honors in 1992, compared with 31 percent in 1985.

The less elite institutions, though they perhaps start at a lower point, have generally failed to resist the reverse gravitational pull. For example, a survey of four-year institutions in Virginia revealed that the mean grade-point average climbed from a below-average 1.81 in 1967 (with "average" being C, which equals 2.0) to an above-average 2.67 in 1976. Mr. Levine's study provides evidence that the trend line in recent years has continued upward at these colleges and universities generally.

Yet, during this same period, there is no evidence to suggest that the academic quality of students has gone up proportionately. For example, a study in Tennessee revealed that the state's American College Testing program average score dropped from 19.1 to 17.1 during the period from 1969-70 to 1975-76, and that it subsequently has hovered around 17.5 into the 1980's. On the national level, the verbal and math scores on the Scholastic Aptitude Test had dropped to a notably lower level in 1991, as compared with 1969, causing the College Board to contribute to the problem by artificially raising the average scores. At many prestigious private institutions, admissions standards in recent years have softened due to economic conditions and, according to some sources, cultural diversification.

The trend at Harvard and elsewhere is that high grades are particularly pronounced in the humanities. The average grade-point averages by field at Harvard line up as follows: humanities--between A-minus and B-plus; social sciences--between B and B-plus; and natural sciences--between B-minus and B. A recent study at Stanford University found that the students in the humanities had the highest grades, whereas those in engineering had the lowest. A 1991 study by two researchers at Williams College of seven prestigious institutions, including Amherst College, Duke University, Pomona College, and the University of Wisconsin, found the same differential drift. The elite institutions are not alone in this internal trend. For example, a national survey of urban, nonresidential institutions found that education and the arts typically were high-grading departments, and physical sciences and mathematics were low-grading departments.

Some academics may argue that the departments that award higher grades do so because they attract exceptionally high-achieving students. But there is no evidence that education and the humanities generally attract superior students. Moreover, in a study at a Virginia college, the difference in grading was, to a notable extent, independent of the students enrolled as majors in the department. More specifically, the researchers found that for the four years studied, five departments, led by education, awarded grades that were, on the average, more than 0.10 grade points higher than those earned by the same students at the same time in other departments. Conversely, four other departments, led by mathematics, awarded grades that were 0.10 grade points lower than those earned by the same students in other departments.
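The same-student comparison described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the Virginia researchers' actual procedure: the record layout, the function name department_differential, and the sample grades are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def department_differential(records, department):
    """Mean gap between the grades a department awards and the grades
    the same students earn in other departments.

    records: iterable of (student, department, grade_points) tuples.
    This layout is an assumption made for the illustration.
    Returns None when no student has grades on both sides.
    """
    inside = defaultdict(list)   # student -> grades earned in `department`
    outside = defaultdict(list)  # student -> grades earned everywhere else
    for student, dept, points in records:
        bucket = inside if dept == department else outside
        bucket[student].append(points)

    # Only students graded both inside and outside the department
    # allow a same-student comparison.
    gaps = [mean(inside[s]) - mean(outside[s]) for s in inside if s in outside]
    return mean(gaps) if gaps else None


# Hypothetical sample data, invented for illustration only.
records = [
    ("ann", "education", 4.0), ("ann", "mathematics", 3.0), ("ann", "history", 3.5),
    ("bob", "education", 3.5), ("bob", "mathematics", 2.5), ("bob", "history", 3.5),
]
print(department_differential(records, "education"))    # positive: grades higher
print(department_differential(records, "mathematics"))  # negative: grades lower
```

A positive result means the department grades the same students more generously than their other departments do; a negative result means the reverse. Because each student serves as his or her own comparison, the figure does not depend on which students a department happens to attract.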

Yet, the Williams researchers found that publicizing such results for their institution had the effect of increasing rather than restraining grade inflation; the low-grading departments eased up to retain or regain enrollments.

Easier grading at the undergraduate level is directly related to student ratings of teaching. Studies in various disciplines have found significant correlation between student ratings of instructors and expected grades of students. Although some academics argue that the higher ratings are entirely attributable to increased academic achievement, the more candid and cogent interpretation is that a significant contributing factor is faculty catering, in terms of grades, to student influence, in terms of ratings. For example, in a national survey of deans of colleges of education and of colleges of arts and sciences, over 70 percent of the respondents agreed that the use of student evaluations as a consideration for promotion and tenure was a major reason for grade inflation. Various studies have provided empirical evidence supporting this quid pro quo hypothesis.

Some observers have attributed grade inflation at the undergraduate level to the generally higher grading norms at the graduate level, where in many disciplines a B has historically been the average grade. Public attention and empirical data about grading practices at the graduate level are much more limited. Although the lay person, perhaps impressed by the general rigor of law schools and medical schools, may not realize that grading differs between the undergraduate and graduate levels, the common conception among academics is that graduate grades often are approximately half B's and half A's.

To the limited extent that I have been able to pierce the academic and administrative veil that covers graduate grading, it appears that grade inflation is far from confined to the preceding levels of the educational system. It is not without trepidation that I report the data from my own university. Although Lehigh University had generally limited grade inflation at the undergraduate level, with the average grade-point average rising less than .03 from 1979 to 1986, the mean G.P.A. at the graduate level has increased steadily from 3.30 in 1979-80 to 3.47 in 1985-86. Moreover, the mean G.P.A. of the college of education, which is entirely on the graduate level, has risen during this period from 3.60 to 3.71. By 1992, the education G.P.A. had reached 3.74; given the negligible number of C's, the stark reality is that about three-quarters of the grades given are A's.
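The step from a mean G.P.A. to a share of A's is simple arithmetic, sketched below. It rests on a simplifying assumption that is not in the data themselves: that only whole-letter A's (4.0) and B's (3.0) are awarded, with no C's and no plus or minus grades. The function name implied_share_of_as is hypothetical.

```python
def implied_share_of_as(mean_gpa, a_points=4.0, b_points=3.0):
    """Fraction of A's implied by a mean GPA when every grade is an A or a B.

    If p is the share of A's, then a_points*p + b_points*(1 - p) = mean_gpa,
    so p = (mean_gpa - b_points) / (a_points - b_points).
    Assumes whole-letter grades only (a simplification).
    """
    if not b_points <= mean_gpa <= a_points:
        raise ValueError("mean GPA must lie between the B and A point values")
    return (mean_gpa - b_points) / (a_points - b_points)


# The education G.P.A. figures reported above.
for year, gpa in [("1979-80", 3.60), ("1985-86", 3.71), ("1992", 3.74)]:
    print(year, f"{implied_share_of_as(gpa):.0%}")
```

Under that assumption, a 3.74 mean corresponds to roughly 74 percent A's, which is the "about three-quarters" figure; any C's or minus grades in the mix would push the true share of A's somewhat higher.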

When I brought the matter to the attention of my colleagues, they were largely unconcerned. Rationales and rationalizations abounded, from academic freedom to student quality. Yet, academic freedom is a very limited legal umbrella. It only applies to grading, if at all, at public institutions, and even in that limited context, it does not bar an administrator from changing the student's grade. Similarly, there is no evidence that student quality in graduate education courses is higher than in other disciplines or that it has increased significantly during the past decade. Yet it is at the graduate level that grade inflation particularly threatens the reliability, validity, and credibility of G.P.A.'s.

Consider these anecdotal examples of our college of education faculty members' attitude toward graduate grade inflation:

  • When, for an accreditation review, we tried to do a study of the effectiveness of various admissions criteria, such as the Miller Analogies Test, in predicting academic success in our college, the statisticians advised us that the project was futile because there was hardly any variance in the predicted variable, graduate grades.
  • A colleague of mine taught various courses and published several articles in which he emphatically exhorted school administrators to stem the tide of making teacher evaluation an empty ritual. He cited a study he had done that revealed that approximately 99 percent of the teachers in Pennsylvania had been evaluated at the maximum level (typically 80 points) of the official rating scale. Yet, I later found out that he typically gave approximately 95 percent of the students in his graduate educational-leadership course grades of A.
  • As the then-chair of the college's promotion and tenure committee, I suggested that among the multiple sources of evidence used to evaluate teaching, such as student ratings, peer observations, teaching and testing materials, course loads, and class sizes, we add grade distribution. My junior and senior colleagues' resistance to the idea was both active and passive, effectively stymieing its implementation.
  • When I suggested that the lay public would, if the graph of our grading trends were published, find these data disturbing, several of my colleagues steadfastly maintained that the results were not in the least embarrassing.
  • When I proposed to endow a modest university teaching award for graduate education based on two primary criteria, grades given to students and ratings given by students, they roundly rejected the offer.

Lehigh, however, is far from alone with regard to grading in graduate education.

When the rationalizations are stripped away from the rationales, the basic problem is that high grades are simply easier. It is difficult to initially arrive at and ultimately defend low grades, particularly when the overall academic trend is in the opposite direction. Conversely, in rather indiscriminately issuing high grades, faculty members feel more secure about enrollments, student ratings, and even administrators' attention. The "clients" or "customers" are happy. And just as long as their parents, the graduate schools, the alumni, and the potential employers ignore or are ignorant of the charade of the Lake Wobegon effect (everybody being above average) gone amok, the only limit to grade inflation, unlike monetary inflation, is the 4.0 top of the scale.

Pointing the finger at the rest of society merely mirrors the problem. Citing grade inflation, Hobart and William Smith Colleges' president, Richard Hersh, recently observed that "colleges and universities must accept some responsibility for the culture of neglect, for we have succumbed to the lower[ed] standards of the larger culture."

Among the relatively few academics who have focused on mitigating the problems of grade inflation, some have proposed increasing the number of grading levels in the grading scale. Given the tendency to cluster grades at the top end of the scale, however, the better alternative, at least for graduate education courses, which often purport to follow a competency-based approach, is to reduce the scale to two levels, such as "mastery" and "non-mastery." An additional and more generally applicable solution is to accompany the student's grade received in the course with the average grade assigned by the instructor for that course. Although proposed repeatedly during the past 10 to 15 years--and reportedly in practice at such locales as Canada's McGill University and the University of Toronto--this worthy proposal has received little more than benign neglect at American institutions.
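The idea of reporting each grade alongside the course average can be sketched as follows. This is a minimal illustration only; the class TranscriptEntry, the helper transcript_entry, the whole-letter point scale, and the sample course are hypothetical and do not describe how any institution actually formats its transcripts.

```python
from dataclasses import dataclass
from statistics import mean

# Conventional 4.0 scale, whole letters only (an assumption for this sketch).
POINTS = {"A": 4.0, "B": 3.0, "C": 2.0, "D": 1.0, "F": 0.0}


@dataclass
class TranscriptEntry:
    course: str
    grade: str             # the student's own grade
    course_average: float  # mean grade points the instructor assigned

    def __str__(self):
        return f"{self.course}: {self.grade} (class average {self.course_average:.2f})"


def transcript_entry(course, student_grade, all_grades):
    """Pair one student's grade with the average grade given in the course."""
    average = mean(POINTS[g] for g in all_grades)
    return TranscriptEntry(course, student_grade, round(average, 2))


# Hypothetical course in which three of four students received an A.
print(transcript_entry("EDUC 401", "A", ["A", "A", "A", "B"]))
```

A reader of such a transcript can see at a glance whether an A was earned in a course where nearly everyone received one.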

The recent steps taken by American colleges and universities in this area at the undergraduate level have been modest at best. For example, Stanford has re-established the grade of F; Princeton and Yale University have added a grade of D to their pass-fail option; and Lewis and Clark College has revived the D grade generally. At the graduate level, examples are rare: The University of Virginia's law school recently adopted a requirement that professors give their classes no higher than a B average.

Below, I offer a different kind of proposal to supplement or stimulate other steps to combat grade inflation, with schools of education serving as the model, or leading, field:

Offer: An endowment of $10,000 for an undergraduate and/or graduate teaching award based on an index of grades given (i.e., G.P.A.) and ratings received (i.e., average of numerical student ratings of instruction), with the former weighted inversely and the latter weighted directly.

Eligibility: School or college of education at a university with annual teaching awards.

Conditions: Inverse weighting of at least 40 percent for grades given and direct weighting of at least 40 percent for ratings received, with the remainder being peer judgment, via a faculty selection committee.

Deadline: Send a written request for more information to Prof. Perry A. Zirkel, College of Education, Lehigh University, Bethlehem, Pa. 18015 by Monday, April 10, 1995.
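One way such an index might be computed is sketched below. The offer specifies only the direction and the minimum weights; everything else here is an assumption made for illustration: the function name award_index, the rescaling of each component to a 0-to-1 range, the 1-to-5 rating scale, and the 0-to-1 peer-judgment score.

```python
def award_index(gpa_given, mean_rating, peer_score,
                w_grades=0.4, w_ratings=0.4,
                gpa_max=4.0, rating_min=1.0, rating_max=5.0):
    """Composite teaching-award index (illustrative sketch).

    gpa_given   : mean grade points the instructor awarded (weighted inversely)
    mean_rating : mean student rating of instruction (weighted directly)
    peer_score  : faculty committee judgment, assumed already on a 0-1 scale
    The peer weight is whatever remains after the two stated weights.
    """
    w_peer = 1.0 - w_grades - w_ratings
    if w_grades < 0.4 or w_ratings < 0.4:
        raise ValueError("grades and ratings must each carry at least 40 percent")
    if w_peer < -1e-9:
        raise ValueError("weights must not exceed 100 percent in total")
    w_peer = max(w_peer, 0.0)

    # Inverse weighting: the lower the grades given, the higher this component.
    grade_component = (gpa_max - gpa_given) / gpa_max
    # Direct weighting: the higher the ratings received, the higher this component.
    rating_component = (mean_rating - rating_min) / (rating_max - rating_min)

    return (w_grades * grade_component
            + w_ratings * rating_component
            + w_peer * peer_score)


# Two hypothetical instructors with equal ratings and peer scores.
print(award_index(gpa_given=3.0, mean_rating=4.5, peer_score=0.5))
print(award_index(gpa_given=3.8, mean_rating=4.5, peer_score=0.5))
```

With ratings and peer judgment held equal, the instructor who gives the lower average grade receives the higher index, so high ratings obtained through easy grading are discounted rather than rewarded.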

