The use of standardized tests as a measure of student success and progress in school goes back decades. Federal policies and programs that mandated yearly assessments as part of state accountability systems significantly accelerated this trend in the past 20 years. But the tide has turned sharply in recent years.
Parents, advocates, and researchers have increasingly raised concerns about the role of testing in education. The shift in people’s attitudes about the use of tests and about the consequences of relying (or possibly over-relying) on test scores for the purposes of both school and teacher accountability raises the question: What can tests tell us about the contributions of schools and teachers to student success in the future?
We think it is important to ask this foundational question: How much do we know about whether there is a causal link between higher test scores and success later in life? After all, that is the purpose of education—preparing students to be successful in the future. We explored this question and the role of tests in a recently published article in Educational Researcher. We conclude that any debate about the use of test scores in educational accountability should: (1) consider the significant evidence connecting test scores to later life outcomes; (2) take into account the difficulty of establishing causality between test achievement and later life outcomes; and (3) consider what alternative measures of success are out there and how reliable they are.
It is certainly reasonable to argue that we should hold schools and teachers accountable for the test performance of their students, but we likely care much more about test performance when it reflects increased learning in school that translates into future success.
There is a vast research literature linking test scores and later life outcomes, such as educational attainment, health, and earnings. These observed correlations, however, do not necessarily reflect causal effects of schools or teachers on later life outcomes. Maybe students who do well on tests are the same students who wake up early in the morning, go to work on time, and work hard, and that’s the reason for their success, not necessarily what they learned in school. Also, differences in test scores could reflect differences in learning opportunities outside of school, including the supportiveness of families or the communities in which students live.
What we do know more definitively about the causality of this relationship comes from a limited number of studies that examine the effects of different educational inputs (for example, schools, teachers, classroom peers, special programs) on both student test scores and later life outcomes. For instance, if a study finds test-score impacts and adult-outcome impacts that are in the same direction, this could be regarded as evidence that test scores (and the learning they represent) have an impact on later life outcomes.
Our view is that studies that might be considered causal do tend to find alignment between effects on test scores and later life outcomes. Perhaps the most influential studies in this strand were published in 2014 by Raj Chetty, John Friedman, and Jonah Rockoff, who found that students who were assigned to teachers deemed highly effective learned more as measured by tests and also were more likely to have better adult outcomes, such as attending college and earning higher salaries.
Another study by Chetty and co-authors examines the long-term effects of peer quality in kindergarten (once again, as indicated by test scores) using the Tennessee Student/Teacher Achievement Ratio (STAR) experiment. The 2011 study finds that students who are assigned to classrooms with higher-achieving peers have higher college attendance rates and adult earnings. Similarly, a study by Susan Dynarski and colleagues from that same year uses the STAR experiment to look at the effects of smaller classes in primary school and finds that the test-score effects at the time of the experiment are an excellent predictor of long-term improvements in postsecondary outcomes.
It is also important to recognize that we might not always expect test-score effects of educational interventions to align with adult outcomes. It is easy to make the case that interventions can improve later life outcomes without affecting the cognitive skills of children. Choice schools may, for instance, have stronger pipelines into college, leading to better college-going results while not affecting learning and test results, but we don’t know this conclusively.
Irrespective of one’s views on the degree to which tests predict later life outcomes, we need to think carefully about what abandoning the use of test scores altogether might mean for education policy and practice. From a practical perspective, we can’t wait many years to get long-term measures of what schools are contributing to students. This does not mean that test scores ought to be the exclusive or even primary short-term measures, but if one believes in some form of educational accountability, it is important to consider what alternative measures of success are out there and how reliable they are.
Lessening the weight of tests in accountability calculations is consistent with the Every Student Succeeds Act (ESSA), but there are concerns about how “gameable” many of the alternative measures might be. And there is no doubt that we know less empirically about the causal connections between many of these alternative measures and long-term student prospects.
For example, are students assigned to teachers who get good classroom observation ratings likely to have better future prospects? Perhaps, but there is less evidence about this type of measure than there is about test-based measures. And if we do not use test scores in teacher evaluations at all, are we going back to the era of teacher accountability when 99 percent of all teachers across the country were rated satisfactory or better?
People clearly have strong feelings about the worth of—and the harm done by—testing. But whatever our personal feelings, we need to evaluate the power of test scores to predict the outcomes we want for our students and consider what the alternatives might be.