Closing the Measurement Gap
Why ‘risk adjustment’ could work for education.
In June, the U.S. Department of Health and Human Services released a report identifying high-mortality hospitals for heart patients. What’s striking about the report is not the number of hospitals on the list, but how few there were.
Only 41 hospitals—less than 1 percent of all hospitals nationwide—were identified as high-mortality. Yet in the 2004-05 school year, 26 percent of American schools landed on education’s comparable list—those that did not make adequate yearly progress under the federal No Child Left Behind Act.
How is it possible that education has so many more organizations on its failing list? It’s not that there is a performance gap between schools and hospitals. The trouble is the profound measurement gap between education and medicine.
Medicine makes use of what is known as “risk adjustment” to evaluate hospitals’ performance. This is not new. Since the early 1990s, states have rated hospitals performing cardiac surgery in annual report cards.
The idea is essentially the same as using test scores to evaluate schools’ performance. But rather than reporting hospitals’ raw mortality rates, states “risk adjust” these numbers to take patient severity into account. The idea is that hospitals caring for sicker patients should not be penalized because their patients were sicker to begin with.
In practice, what risk adjustment means is that mortality is predicted as a function of dozens of patient characteristics. These include a laundry list of factors outside the hospital’s control that could affect a patient’s outcomes: the patient’s other health conditions, demographic factors, lifestyle choices (such as smoking), and disease severity. This prediction equation yields an “expected mortality rate”: the mortality rate that would be expected given the mix of patients treated at the hospital.
While the statistical methods vary from state to state, the crux of risk adjustment is a comparison of expected and observed mortality rates. In hospitals where the observed mortality rate exceeds the expected rate, patients fared worse than they should have. These “adjusted mortality rates” are then used to make apples-to-apples comparisons of hospital performance.
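To make the comparison concrete, here is a minimal sketch in Python. It assumes each patient already has a predicted risk of death from some prediction equation; the Patient class, the function names, and all the numbers are hypothetical, invented only for illustration. Scaling the observed-to-expected ratio by an overall mortality rate is one common way to report an adjusted rate, not necessarily the method any particular state uses.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    predicted_risk: float  # hypothetical output of a prediction equation, 0..1
    died: bool             # observed outcome


def observed_rate(patients):
    """Share of patients who actually died."""
    return sum(p.died for p in patients) / len(patients)


def expected_rate(patients):
    """Average predicted risk: the mortality expected for this patient mix."""
    return sum(p.predicted_risk for p in patients) / len(patients)


def adjusted_rate(patients, overall_rate):
    """Observed-to-expected ratio, scaled by an overall (e.g. statewide) rate.

    A ratio above 1 means patients fared worse than predicted.
    """
    return observed_rate(patients) / expected_rate(patients) * overall_rate


# Made-up data: both hospitals lose 1 patient in 4, but Hospital A
# treats much sicker patients than Hospital B.
hospital_a = [Patient(0.10, False), Patient(0.20, False),
              Patient(0.30, True), Patient(0.40, False)]
hospital_b = [Patient(0.05, False), Patient(0.05, False),
              Patient(0.05, False), Patient(0.05, True)]

for name, patients in [("A", hospital_a), ("B", hospital_b)]:
    ratio = observed_rate(patients) / expected_rate(patients)
    print(name, round(ratio, 2), round(adjusted_rate(patients, 0.10), 3))
```

In this toy example the two hospitals have identical raw mortality rates, yet Hospital A's observed-to-expected ratio is about 1 (performing as predicted) while Hospital B's is about 5 (far worse than predicted for its healthier patients).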
Accountability systems in medicine go even further to reduce the chance that a good hospital is unfairly labeled. Hospitals vary widely in size, for example, and in small hospitals a few aberrant cases can significantly distort the mortality rate. So, in addition to the adjusted mortality rate, “confidence intervals” are reported to illustrate the uncertainty that stems from these differences in size. Only when these confidence intervals are taken into account are performance comparisons made between hospitals.
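The effect of hospital size on uncertainty can be sketched in a few lines of Python. This example uses the Wilson score interval for a simple proportion, which is an assumption made here for illustration; actual report cards compute intervals around adjusted rates with their own methods. The function names and the case counts are hypothetical.

```python
import math


def wilson_interval(deaths, cases, z=1.96):
    """Approximate 95% Wilson score interval for a mortality proportion."""
    if cases <= 0:
        raise ValueError("cases must be positive")
    p = deaths / cases
    denom = 1 + z * z / cases
    center = (p + z * z / (2 * cases)) / denom
    half = z * math.sqrt(p * (1 - p) / cases
                         + z * z / (4 * cases * cases)) / denom
    return center - half, center + half


def is_high_outlier(deaths, cases, expected_rate):
    """Flag a hospital only if its whole interval lies above the expected rate."""
    low, _ = wilson_interval(deaths, cases)
    return low > expected_rate


# Made-up numbers: both hospitals have a 10% observed rate against a
# 5% expected rate, but they differ a hundredfold in size.
small = (2, 20)       # 2 deaths in 20 cases
large = (200, 2000)   # 200 deaths in 2,000 cases

print(wilson_interval(*small), is_high_outlier(*small, expected_rate=0.05))
print(wilson_interval(*large), is_high_outlier(*large, expected_rate=0.05))
```

With these numbers the small hospital's interval is wide enough to include the expected rate, so it is not flagged, while the large hospital's much narrower interval sits entirely above it.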
Both the federal and state governments have acknowledged that we must take the mix of patients into account to fairly compare hospitals—so why not do the same in education?
Though the distance between the federal departments of Health and Human Services and of Education represents only a brief jaunt down Washington’s C Street, these agencies’ ideas of how to measure organizational performance come from different worlds altogether. Consider the language used by the Health and Human Services Department to explain the rationale behind risk adjustment:
“The characteristics that Medicare patients bring with them when they arrive at a hospital with a heart attack or heart failure are not under the control of the hospital. However, some patient characteristics may make death more likely (increase the ‘risk’ of death), no matter where the patient is treated or how good the care is. … Therefore, when mortality rates are calculated for each hospital for a 12-month period, they are adjusted based on the unique mix of patients that hospital treated.”
What’s ironic is that the federal government endorses entirely different ideas about measuring performance in education. In medicine, governments recognize that hospitals should not be blamed for patients’ characteristics that they do not control. When educators make this point, they are censured for their “soft bigotry of low expectations.” In medicine, the government acknowledges that no matter how good the care is, some patients are more likely to die. In education, there is a mandate that schools should make up the difference, irrespective of students’ risk factors.
Perhaps most of us could agree that if the sole goal of accountability systems is to compare organizations, risk adjustment is the fairest and most accurate approach. When patients in a hospital die, we rarely claim that the hospital is failing. By the same token, the fact that some students in a school are failing does not mean that the school is failing. Sound public policy should be able to distinguish these two conditions.
Nonetheless, even if a school is not performing worse than expected given the unique risks of the students it is serving, there is still a compelling public interest in remedying the situation. Accountability systems do not seek only to measure organizations’ performance. They have the additional goal of spurring schools to improve the quality of education provided and to level the playing field, especially for poor and minority children. We have rightly seen tremendous political opposition to the idea of lowering expectations based on students’ social backgrounds.
By separating the issue of measurement from the issue of social goals, risk adjustment has the potential to break the stalemate in this argument. Risk adjustment could make accountability a two-stage process, in which we first ask the question, “Is the school responsible?” By all means, when schools are performing worse than they should be, given the population they are serving, holding educators’ feet to the fire is a reasonable response.
But when a school is not a low-performing outlier, it makes no sense to castigate educators and hope for the best. We’re simply not going to get the best results for our kids that way. Instead, funding for a robust set of interventions should be made available to these schools. It is the task of education research to figure out which interventions give us the biggest bang for our buck. The key point, however, is that we cannot expect to get the same results for disadvantaged kids on the cheap.
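The two-stage logic described above can be written out as a short Python sketch. This is one possible reading, not a specification: the second-stage test (whether students still fall short of a common goal), the Response names, and every threshold are assumptions introduced here for illustration.

```python
from enum import Enum


class Response(Enum):
    HOLD_ACCOUNTABLE = "hold educators accountable"
    FUND_INTERVENTIONS = "fund interventions"
    NO_ACTION = "no action needed"


def two_stage_response(observed_pass_rate, expected_low, target_pass_rate):
    """Pick a policy response for one school.

    observed_pass_rate: share of students meeting the standard.
    expected_low: hypothetical lower bound of the range expected for this
        school's student population after risk adjustment.
    target_pass_rate: hypothetical goal set for all schools.
    """
    # Stage 1: is the school responsible? It is a low-performing outlier
    # only if it does worse than expected for the students it serves.
    if observed_pass_rate < expected_low:
        return Response.HOLD_ACCOUNTABLE
    # Stage 2: the school performs as expected or better, but its students
    # still fall short of the common goal, so resources are the response.
    if observed_pass_rate < target_pass_rate:
        return Response.FUND_INTERVENTIONS
    return Response.NO_ACTION


print(two_stage_response(0.40, expected_low=0.50, target_pass_rate=0.80))
print(two_stage_response(0.55, expected_low=0.50, target_pass_rate=0.80))
print(two_stage_response(0.85, expected_low=0.50, target_pass_rate=0.80))
```

The design choice worth noticing is that the target stays the same for every school; only the response to missing it depends on the risk-adjusted comparison.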
Which out-of-school factors should be part of the risk-adjustment scheme? Any factors out of the school’s control that could affect student outcomes. Some of these might include prior achievement, poverty, age, race and ethnicity, gender, attendance, type of disability, residential mobility, and whether the student is an English-language learner. Attention would need to be paid to the issue of “upcoding,” through which a school could create the appearance of a more disadvantaged population than truly existed. In medicine, careful monitoring and periodic audits have gone a long way to keep upcoding in check.
The most common objection to risk adjustment is that it lets educators off the hook. This is far from the truth. Risk adjustment places blame where it belongs. In some cases, this will be at the feet of schools and teachers. However, risk adjustment would likely reveal that, in large measure, we can blame our city, state, and federal governments for failing to make sufficient investments in public education to produce the results they demand.
If we close the measurement gap, we can begin a radically different conversation about what investments in public education are necessary to give disadvantaged kids a fair shot. The likely educational dividends for these children make risk adjustment a risk worth taking.
Vol. 27, Issue 10, Pages 25, 27
Published in Print: October 31, 2007, as Closing the Measurement Gap