Assessment Opinion

Better to Be Lucky Than Good: The Persistent Gender Gap in Standardized Testing

By Tom Segal — July 19, 2013 7 min read

Guest post written by Scott Schonberger, thinker of thoughts, taker of notes, chopper of wood. Scott’s work includes previous musings on this here blog.

Near the end of junior year or at the beginning of senior year, over 3.2 million lucky high school students from all across the country will take a standardized test for college admission. These students will take either the Scholastic Assessment Test (SAT) or the American College Testing (ACT) exam and, at a majority of colleges and universities, the scores on these standardized tests will play a major role in determining student admissions. The results from these tests are supposed to be indicative of a student’s first-year performance in college, thus providing admissions offices with an additional metric for stratification. Proponents of standardized testing argue that the tests allow colleges to look at everyone on an equal playing field, while opponents feel that standardized testing is a poor representation of a student’s ability.

Since 1972, boys have overshadowed girls on the SAT, registering higher overall scores each and every year by an average of 45 points. The main driver of the overall scoring gap is the math section of the test, where boys have outscored girls by an average of 38 points. This gender gap persists despite girls attending college in larger numbers and generally achieving better freshman-year grades than boys with the same SAT scores (1,2). The existence of the gender gap in test scores is clear; what is opaque is why the gap exists and what can be done to close it.

The College Board, a non-profit organization that publishes the SAT, has been tracking scoring statistics for the exam since 1972. The scores for 2012 are listed in the table below:

This table could lead some readers to believe that 2012 was just a bad year for girls. In fact, 2012 was an above-average year for girls in terms of how far their scores lagged behind those of their male counterparts.

Since tracking began, girls have recorded a yearly average math score of 488 compared to a yearly average math score of 526 for boys, a 38-point difference. The gap has remained remarkably consistent over time, with male and female scores following the same peaks and valleys.

Noting this consistency is important because it lends credence to the idea that the score difference results from factors inherent in the test as a whole and not from a bias written into particular test questions. In years in which the boys’ average score dropped relative to the previous year, the girls’ average dropped as well. In years where girls’ average scores were higher, boys’ scores went up as well. The easiest, and most ignorant, conclusion to draw from this relationship is that girls are just worse at math than boys, but data from outside the SAT do not support that.

During their freshman year, “females, on average, earned higher grades in their first-year mathematics courses [than males] (mean = 2.79 versus 2.63 for males).”(5) In letters, this means that female students are, on average, earning a B- while their male counterparts earn a C+. Overall grades in math courses tend to be heavily centered on testing, making the comparison to the SAT math section apt.

Casting aside the idea that a conspiracy exists to embed a fixed number of questions into the test each year that girls will most likely get wrong, we are left with a perplexing situation. If girls are higher achievers in the classroom, why do they consistently score lower on the SAT?

The first possibility is that the SAT is not an accurate predictor of mathematical ability. Even if that were true, it still wouldn’t explain the consistent gender gap. If the test were simply a poor predictor, some fluctuation in the scoring gap would be expected.

A second possibility is, simply, that boys are better at math than girls. In the past this may have been true: boys were “better” at math. “Better,” because research has shown that social constructs have a profound ability to affect academic achievement. For most of this country’s history (and some would argue that it continues to the present day) there was a general acceptance that math was a place for men. Constantly reminding girls of this idea may have created a self-fulfilling prophecy in which girls consistently underachieve in math.(6) However, in the past decade this prophecy appears to have been undone, and girls are now achieving at the same levels as their male counterparts.(7)

A third possibility is that the gender gap results from the construction of the SAT itself. The logic behind this has to do with gender-specific responses to risk and the way SAT scoring rewards risk taking. There are three possible outcomes for a question on the SAT: correct, incorrect and no answer. Correct answers are worth one point, incorrect answers are worth -.25 points and unanswered questions are worth zero points (all raw score points). The SAT math section is presented in a multiple-choice format; each question has a single correct answer and four incorrect answers. A blind guess among all five choices therefore breaks even on average, but if a test taker is able to eliminate at least one incorrect answer, the cost-benefit ratio shifts in favor of guessing: across the four remaining choices there is a net gain of .25 points to be had (1 point for the correct answer and -.25*3 for the incorrect answers), or roughly .06 points per guess on average. If boys are more prone to guessing at a question they cannot answer and girls are prone to skipping the same question, this would skew test results in favor of boys. This would also mean boys are achieving a better score on the SAT because of a loophole in the scoring system, not mathematical ability.
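For readers who want to check the arithmetic, here is a minimal sketch of the expected payoff from guessing. It assumes a five-choice question, the scoring described above (+1 for a correct answer, -.25 for an incorrect one) and a test taker who picks at random among whatever choices remain; the function name is made up for illustration.

```python
from fractions import Fraction


def expected_guess_value(choices_left, reward=Fraction(1), penalty=Fraction(1, 4)):
    """Average raw-score change from one random guess among the remaining choices."""
    p_correct = Fraction(1, choices_left)
    return p_correct * reward - (1 - p_correct) * penalty


# Assumption: every question starts with five answer choices.
for eliminated in range(4):
    left = 5 - eliminated
    value = expected_guess_value(left)
    print(f"{eliminated} eliminated, {left} left: {float(value):+.4f} points per guess")
```

With nothing eliminated the average payoff is exactly zero, which is the break-even point the penalty is designed to produce. Every choice a test taker can rule out pushes the average above zero, while skipping the question always scores zero.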

There is a fair amount of literature on the relationship between gender, risk taking and exams. James Byrnes, David Miller and William Schafer published a meta-analysis of 150 studies that looked at the risk-taking tendencies of men and women.(8) The results of the study showed that across the board men were more likely to engage in risk-taking activity and the largest gender differences were seen in the areas of physical skill and intellectual risk. Boys being more likely to guess on an exam would explain the contradiction between math performance on the SAT and math performance in a school setting. While the SAT rewards taking an educated guess through a favorable scoring formula, a college exam does not. Unless a student has a very eccentric teacher, a correct or incorrect answer on an exam will have the same absolute value.

If skewed scores on the SAT are the result of an inadequate scoring system, the easiest solution would be to change it. Instead of a score building from a base number (like 0), every exam taker could begin the test with a score that is equivalent to the 50th percentile.(9) By doing this, each student is “average” when they initially sit down. Under this format, every correct answer moves a test taker closer to the top, every incorrect answer moves a test taker closer to the bottom and unanswered questions continue to have no impact.(10) This would reward the test takers who get the most correct answers and penalize those who are only good guessers. Thus, when students leave the exam, their score will reflect how far they have deviated from the 50th percentile.
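To see why such a format would discourage guessing, consider a rough sketch. The proposal above does not say how far each answer should move a score, so this example assumes, purely for illustration, that a correct answer moves a test taker up one step and an incorrect answer moves them down one step of the same size. The function names and the step size are hypothetical.

```python
from fractions import Fraction


def deviation_from_start(correct, incorrect, step=1):
    """Steps above (+) or below (-) the 50th-percentile starting point.

    Assumption: correct and incorrect answers move the score by equal steps;
    unanswered questions are not counted at all.
    """
    return step * (correct - incorrect)


def symmetric_guess_value(choices_left, step=1):
    """Average movement from one random guess among the remaining choices."""
    p_correct = Fraction(1, choices_left)
    return step * (p_correct - (1 - p_correct))


for left in (5, 4, 3, 2):
    value = symmetric_guess_value(left)
    print(f"{left} choices left: {float(value):+.2f} steps per guess")
```

Under this assumed equal step, a random guess loses ground on average until the test taker has narrowed a question down to two choices, at which point it only breaks even. A smaller penalty would soften that effect, so the exact step sizes would matter a great deal in practice.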

Some may say, “Why does this matter? Girls are attending college at a higher rate than ever.” My response would be that this argument is about quality, not quantity. An artificially created scoring gap on the SAT may not prevent someone from going to college, but it will affect the quality of the school they get into. And, even though more than 800 colleges do not require test scores for admission (11), that is only 29% of all colleges in the United States.(12) An incredible amount of work has been done during the past decades to even the academic playing field and it has begun to pay dividends. However, allowing one of the primary measures for college admission to continue unchanged when there is clear evidence that it is flawed is unconscionable. Because the chance of standardized testing for college admission being done away with anytime soon is low, everything should be done to make sure the test is fair.

1) SAT revision stokes fears of wider math-gap | womens eNews Retrieved 7/19/2013, 2013.
2) Mattern, K. D., Patterson, B. F., & Kobrin, J. L. (2012). The validity of SAT scores in predicting first-year mathematics and English grades (No. 1). College Board.
3) The SAT® report on college & career readiness: 2012 | research and development Retrieved 7/19/2013, 2013.
4) Ibid.
5) The Validity of SAT scores, Mattern.
6) Are boys better than girls at maths? -- PsyBlog Retrieved 7/19/2013, 2013.
7) Hyde, J. S., Lindberg, S. M., Linn, M. C., Ellis, A. B., & Williams, C. C. (2008). DIVERSITY: Gender similarities characterize math performance. Science, 321(5888), 494-495. doi:10.1126/science.1160364
8) Byrnes, J. , Miller, D. , & Schafer, W. (1999). Gender differences in risk taking: A meta-analysis. Psychological Bulletin, 125(3), 367-383.
9) I believe this is how the GRE is scored.
10) Test takers would have to be required to answer a minimum number of questions to have a recordable score
11) SAT/ACT optional 4-year universities | FairTest Retrieved 7/19/2013, 2013.
12) Fast facts - national center for education statistics. Retrieved 7/19/2013, 2013.

The opinions expressed in Reimagining K-12 are strictly those of the author(s) and do not reflect the opinions or endorsement of Editorial Projects in Education, or any of its publications.