Statistics Agency to Review 'Best High Schools' Data
The National Center for Education Statistics plans to check data on about 5,000 high schools after faulty information from the federal agency led to erroneous rankings for three high schools in U.S. News & World Report’s yearly “Best High Schools” report.
As a part of its rankings, U.S. News uses the Common Core of Data, a rich repository of information on every public school, district, and state education agency in the country. This year’s report was based on data collected in the 2009-10 school year.
However, Jeff Horn, the principal at Green Valley High School in Henderson, Nev., noted that his school’s number 13 ranking was based on federal statistics that mistakenly said his school had 477 students, 111 teachers, and a 100 percent passing rate on Advanced Placement tests that school year. In fact, the school has about 2,850 students, a student-teacher ratio that is closer to 24 to 1, and an AP pass rate of about 64 percent. Student-teacher ratios and AP pass rates are part of the magazine’s ranking system.
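The size of the gap is easy to see from the figures themselves. The short sketch below works only from the numbers quoted above; the function name is illustrative, and the implied teacher count is a back-of-the-envelope figure derived from the approximate 24-to-1 ratio, not a number reported by the school.

```python
def student_teacher_ratio(students, teachers):
    """Students per teacher, as a plain quotient."""
    return students / teachers


# Figures the federal data set mistakenly listed for Green Valley High School.
reported_ratio = student_teacher_ratio(477, 111)

# Teacher count implied by roughly 2,850 students at about 24 to 1.
# This is an illustrative estimate only, not a reported figure.
implied_teachers = 2850 / 24

print(round(reported_ratio, 1))   # 4.3
print(round(implied_teachers))    # 119
```

On the erroneous figures, the school would have had roughly one teacher for every four students, far from the 24 to 1 the principal described.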
The Las Vegas Sun reported on Mr. Horn’s concerns about the data the day after the rankings were released, May 8. Within days, two additional high schools, this time in California, noted that they, too, had high rankings based on bad data:
• Dublin High School, in Alameda County, was listed as having 493 students, a 100 percent AP pass rate, and a student-teacher ratio of 7 to 1. The truth: The school has about 1,650 students and 24 students for every teacher.
• San Marcos High School, in San Diego County, which took the number 11 spot on the U.S. News list, was also listed as having a 100 percent AP pass rate. The Common Core of Data now lists no student demographic data for the high school, but the North County Times newspaper noted that the page originally said there were 79 students in the 12th grade in the 2009-10 school year, compared with 556 students in 9th grade, 554 students in 10th grade, and 818 students in 11th grade.
Stephen L. Hanke, the superintendent of the 7,000-student Dublin district, said in an interview that the problem in his district came from bad data generated during a move to a new student-reporting system in both the district and the state. The district notified state officials in December 2010 about the bad numbers, but the corrections apparently were not transmitted to the federal government.
“We’ve got a great high school, and we think we do great things every single day,” Mr. Hanke said. “But this is also a lesson about honesty and integrity.”
Robert Morse, the director of data research for U.S. News, has not yet responded to a phone call and email from Education Week seeking comment on the discrepancies.
Tracing the Errors
Marilyn M. Seastrom, the chief statistician for the National Center for Education Statistics, which is the statistical arm of the U.S. Department of Education, said fail-safe mechanisms at the federal level somehow did not pick up on the data problems.
When data is first submitted to the federal government, it goes through a “high-level” edit by the Education Data Exchange Network, which is supposed to check for major discrepancies, such as school or district populations that change sharply from year to year.
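A year-to-year check of that kind might look something like the sketch below. It is not the Education Data Exchange Network's actual software, whose rules the article does not describe. The function name, the school identifiers, the prior-year counts, and the 50 percent threshold are all assumptions made for illustration.

```python
def flag_large_changes(previous, current, max_relative_change=0.5):
    """Return IDs of schools whose enrollment changed by more than the threshold.

    `previous` and `current` map a school ID to an enrollment count.
    The 0.5 (50 percent) default is an assumed threshold, not a federal rule.
    """
    flagged = []
    for school_id, prior_count in previous.items():
        new_count = current.get(school_id)
        # Skip schools with no current record or no usable baseline.
        if new_count is None or prior_count <= 0:
            continue
        relative_change = abs(new_count - prior_count) / prior_count
        if relative_change > max_relative_change:
            flagged.append(school_id)
    return flagged


# Hypothetical records: school_a drops from 2,850 to 477, school_b barely moves.
previous_year = {"school_a": 2850, "school_b": 1200}
current_year = {"school_a": 477, "school_b": 1230}

print(flag_large_changes(previous_year, current_year))  # ['school_a']
```

A drop from 2,850 students to 477 is a change of more than 80 percent, so under these assumed settings the record would be held for review, while a school that grew by 30 students would pass.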
The data is then examined a second time by statisticians at the NCES, in partnership with the U.S. Census Bureau.
In Nevada’s case, Ms. Seastrom explained, the state submitted data for its schools and districts, which went through the two-part clean-up process. The state was then told that data on one charter school needed to be fixed.
Instead of submitting data on that one school to the federal government, the state had to submit an entirely new data set, she said. But this time, there were additional errors. Somehow the first-level software did not pick up on the problems, and federal officials, under pressure to meet a deadline, didn’t check all the numbers again.
“Lesson learned: Unless you have a system where you can only change one variable, you need to check all the numbers,” Ms. Seastrom said. That problem appears to have affected six schools in Clark County, Nev., though only one made the U.S. News list.
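One way to act on that lesson is to compare a resubmitted data set against the original and list every value that changed, not just the one that was supposed to. The sketch below assumes each submission is a simple mapping from a school ID to its fields; the function names, IDs, and numbers are hypothetical and do not describe the federal system.

```python
def changed_fields(original, resubmitted):
    """Return {school_id: {field: (old, new)}} for every value that differs."""
    changes = {}
    for school_id in sorted(set(original) | set(resubmitted)):
        old_record = original.get(school_id, {})
        new_record = resubmitted.get(school_id, {})
        diffs = {}
        for field in sorted(set(old_record) | set(new_record)):
            old_value = old_record.get(field)
            new_value = new_record.get(field)
            if old_value != new_value:
                diffs[field] = (old_value, new_value)
        if diffs:
            changes[school_id] = diffs
    return changes


def unexpected_changes(original, resubmitted, expected_ids):
    """Keep only changes to schools that were not supposed to be corrected."""
    all_changes = changed_fields(original, resubmitted)
    return {sid: d for sid, d in all_changes.items() if sid not in expected_ids}


# Hypothetical submissions: only charter_x was meant to be fixed.
first_submission = {
    "charter_x": {"students": 0},
    "school_y": {"students": 2850},
}
second_submission = {
    "charter_x": {"students": 310},
    "school_y": {"students": 477},
}

print(unexpected_changes(first_submission, second_submission, {"charter_x"}))
```

Here the comparison would surface the change to school_y even though only charter_x was flagged for correction, which is the kind of side effect a full recheck is meant to catch.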
Programmers are working to trace the software error that didn’t pick up on the problem, Ms. Seastrom said. Additionally, the federal government is now planning to release data in three waves, to allow states to look it over for major problems. And it plans to check the data for the approximately 5,000 schools on the magazine’s list, which includes the top nationwide rankings, as well as rankings of schools in individual states, she said.
Many organizations rely on the Common Core of Data for educational policy research. However, Steven M. Glazerman, a senior fellow at Princeton, N.J.-based Mathematica Policy Research, said that the data set has recognized limitations.
Mathematica, which conducts in-depth social policy reports on behalf of the Education Department and others, typically collects its own information, Mr. Glazerman said. But hypothetically, for a report on urban districts, the organization might compare the information it gathers to all urban districts nationwide, using the Common Core of Data for that urban district average. Or, it might use the federal data set as a way to develop a pool of districts to study. For example, it might use the Common Core of Data to produce a list of districts with 20 or more elementary schools.
“That kind of data doesn’t tend to change a lot at the margins,” Mr. Glazerman said. “I would say there’s nothing surprising here if you routinely work with data. You have your antenna up for a certain amount of what we call ‘data slop.’ ”
In an “ambitious” project like U.S. News’ attempt to rank schools, which the magazine has been doing for the past four years, “small errors can really throw things off,” Mr. Glazerman said.