Ten years ago this month, President George W. Bush signed into law the No Child Left Behind Act, setting the stage for a new—and more aggressive—phase of accountability in American education.
The United States isn’t alone in promoting accountability in elementary and secondary education. The notion in recent years has become a global phenomenon among nations looking to improve their school systems.
What accountability practices look like in other countries, however, varies considerably, from publicly reporting school results on assessments to conducting school inspections and administering high-stakes “gateway” exams that play a big role in determining students’ academic and career prospects.
Experts say the U.S. approach appears to be something of an outlier, at least as defined in the No Child Left Behind era, with the main focus on grade-by-grade standardized testing that drives an escalating set of sanctions for schools that fail to meet specific achievement targets over time.
“Right now, it’s very much an American experiment,” says Yong Zhao, an education professor at the University of Oregon who has studied comparative education and is a critic of the NCLB law’s testing and accountability provisions. He was a member of the Quality Counts 2012 advisory panel.
At a time when American policymakers are rethinking elements of accountability, and amid increased concern about how U.S. students stack up globally, questions arise about how the American approach compares and contrasts with that of other nations and whether there are practices from abroad to consider.
One such possibility: school inspections. A number of countries, such as England, the Netherlands, New Zealand, and Singapore, require these external reviews to help gauge academic quality and hold schools accountable for improvement.
Also, as the idea of making student achievement a central element of teacher evaluations appears to be building steam in the United States, experts note that the professional judgments of supervisors and sometimes of peers, rather than test scores, remain the mainstay in most developed nations, including top achievers. Indeed, some analysts say the most powerful lessons may well come from places like Finland and Singapore that have taken a comprehensive approach to ramping up the quality of their teaching forces. (“Among Top-Performing Nations, Teacher Quality, Status Entwined,” this issue.)
Meanwhile, the main weight of accountability in many industrialized nations tends to fall on students, rather than schools or teachers, as seen through the gateway-exam systems common in Europe and East Asia.
“We go very heavy on school accountability,” says Tom Loveless, a senior fellow at the Brookings Institution, a Washington-based think tank. “The rest of the world is fairly light on that but is much heavier on student accountability, where they hold students accountable for what they’re supposed to know, and there are consequences attached to that, such as the track or stream they’re placed into.”
Although a gateway system may not be deemed politically feasible or desirable in the United States, some observers suggest it’s worth exploring ways to foster more student accountability.
The design of educational accountability in the United States has long been the subject of fierce debate, especially in the years since the NCLB law’s enactment. A central concern is the perceived rigidity of the law’s mandates for identifying low-performing schools and the steps required to intervene, including corrective actions and restructuring that may involve removing teachers or converting to a charter school.
In addition, many educators and analysts say U.S. policymakers rely too heavily on standardized tests to measure student learning and school quality. The pushback is especially pronounced given the widespread belief that the tests most states administer for accountability purposes under the law provide limited information on student achievement.
Daniel Koretz, a professor at the Harvard Graduate School of Education, laments the intense “weight and faith given to test scores” in the United States.
“You generally don’t find people [in other countries] saying, ‘We’re going to impose a 90-minute math exam and we’re going to evaluate a school based on that,’ ” he says.
Although U.S. Secretary of Education Arne Duncan has made clear that he sees testing as a vital ingredient of accountability, he has pushed to improve the quality of assessments used for that purpose, saying the nation must move beyond “fill-in-the-bubble” tests that measure basic skills.
Fueled by more than $350 million from the federal Race to the Top program, two state consortia are developing common assessments—pegged to the common standards in English/language arts and math recently adopted by most states—intended to be more rigorous and to better evaluate learning.
Speaking at a global education forum last year, Duncan said the work to devise strong common standards and aligned assessments reflects a “sea change” in American education in line with top-achieving countries. The new exams, he said, “will test higher-order thinking skills, much like the high-quality assessments used overseas.”
Duncan observed: “High-performing nations may differ on how they assess learning. Yet every top-performer is using data in one form or another to inform instruction and to monitor and improve performance.”
Some key aspects of U.S. federal policy on testing and accountability appear to be unusual in the global sphere. For one, analysts say that national or state standardized tests typically occur far less often overseas than is required under the NCLB law, which calls for annual testing in grades 3-8 and once again in high school.
Another apparent outlier is the U.S. mandate to disaggregate performance data at the school level by student subgroups, including race, ethnicity, and income status.
“I don’t know of any other countries that do that,” says Sir Michael Barber, a former top education adviser to then-British Prime Minister Tony Blair and a student of education globally. “Given the history of race and civil rights in the United States, I think that is unusual, and I personally think it is important. It really puts the [achievement] gaps on the agenda.”
Many nations do, however, make school-level achievement data publicly available.
The United States benchmarks against three major international tests:
The Program for International Student Assessment, or PISA, administered by the Organization for Economic Cooperation and Development, evaluates reading, math, and science among 15-year-olds every three years. It is considered to use more open-ended and essay questions than other international tests and requires students to transfer knowledge or skills from one content area to another. Seventy-four countries and jurisdictions participated in the most recent assessment.
The Trends in International Mathematics and Science Study, or TIMSS, administered by the International Association for the Evaluation of Educational Achievement, evaluates math and science in 4th and 8th grades every four years. In 2011, 78 countries and states participated. It is considered to have a structure more similar to that of the National Assessment of Educational Progress, or NAEP, although it puts different weight on content areas.
The Progress in International Reading Literacy Study, or PIRLS, also administered by IEA, evaluates reading among 4th graders every five years. In 2011, 57 countries and states participated in the test.
The United States’ performance on international tests is often described as middle-of-the-pack. The country’s showing on nation-by-nation comparisons varies somewhat, depending on the exam and the academic skills it measures.
SOURCE: Institute of Education Sciences
Australia, which launched national exams in 2008, recently unveiled a federal website with a snapshot of individual schools, including test results as well as a comparison of any given school’s achievement with others that have similar student demographics.
Although analysts say tying penalties to schools based on test scores is unusual, that is not to say nothing happens with struggling schools overseas.
Many nations use test data to “guide intervention, reveal best practices, and identify shared problems ... in order to encourage teachers and schools to develop more supportive and productive learning environments,” the Paris-based Organization for Economic Cooperation and Development explains in a 2010 report, “Strong Performers and Successful Reformers in Education.”
“We don’t declare our schools to be failing,” says Ben Levin, an education professor at the University of Toronto and a former deputy education minister in Ontario. “But we differentiate our support for schools based on their level of performance. ... So if you’re in a school that has not very good performance, you’re going to get more support both from the district and the province.”
In Singapore, notes a 2005 report from the Washington-based American Institutes for Research, schools use a national exam to identify upper-elementary students who struggle in math. Those students receive specialized instruction based on an adapted curriculum, as well as more instruction so that they can cover the same rigorous content, only at a slower pace, the study says. (Singapore also provides financial rewards to schools that show better-than-expected performance on value-added measures of school outcomes, according to the study.)
The United States’ closest cousin when it comes to school accountability may well be England, experts say. In addition to publicly reporting achievement data down to the school level, the country sets “floor targets” for schools based on national tests at the end of primary school and again in secondary school, though those results do not take into account student demographics. A school’s failure to meet the targets can result in government-mandated intervention and possible takeover, closure, or conversion into a government-managed academy.
(In 2010, about one-quarter of England’s primary schools boycotted the exams, according to the BBC, citing concern about pressure to “teach to the test” and frustration with the news media’s use of the results to rank schools based on achievement in so-called league tables.)
But England brings another dimension to accountability lacking in the United States: a national school inspection system. Such systems exist in a number of countries, especially in Europe, and in some instances date back more than a century.
Craig D. Jerald, a Washington-based education consultant who recently wrote a report on the English inspectorate, the Office for Standards in Education, Children’s Services and Skills (known as OFSTED), suggests that this approach may be a promising option for states looking to move beyond a simple reliance on test scores.
“We need to bring expert judgment into school evaluation and accountability, and one way to do that is inspection,” he says. “It’s a way to handle a multiple-measure approach to evaluating schools. You can either hand that over to a spreadsheet or a trained expert.”
Under the English system, inspectors typically visit a school for two days. Schools are rated “outstanding,” “good,” “satisfactory,” or “inadequate.” Test data are used in the evaluation, but so are other factors, including classroom observations to determine the quality of instruction. Schools rated inadequate can be placed into “special measures,” which involves developing an improvement plan and more regular inspections. If the school fails to improve, more-severe consequences may follow, such as replacing the principal or closing the school.
England last fall revised its inspection framework, amid concern that the inspections had focused on an overly lengthy list of topics. The new framework narrows the scope to four areas: student achievement, teaching and learning, school leadership and management, and standards of behavior and safety. A key objective, OFSTED explained, was for inspectors to spend more time observing classrooms, including listening to children read in primary schools, assessing their progress, and observing student behavior.
At least a few U.S. school systems have recently tried conducting formal inspections, including those in New York City; Charlotte, N.C.; and Sacramento, Calif.
“We wanted to give a qualitative assessment of a school, and to do that, we developed a highly specific rubric,” says Jerry Winkeljohn, the former director of school improvement in the 134,000-student Charlotte-Mecklenburg district, which halted the inspections last year amid budget cuts.
Melanie Ehren, an education researcher at the University of Twente, in Enschede, the Netherlands, who has studied school inspections, says that given the limits of testing, inspections could be a powerful tool for the United States to home in on “instructional quality.” Given the expense involved, she says, a state might consider visiting only struggling schools or those deemed at risk of falling behind. In fact, her country just instituted such a policy for “risk-based” inspections as a cost-saving measure, she says.
Use of Testing
Experts say one core dimension of accountability lacking in the United States is the use of high-stakes, government-sponsored gateway exams.
“Virtually all high-performing countries have a system of gateways marking the key transition points,” such as from basic to upper-secondary education and from upper-secondary education to university, writes Marc S. Tucker, the president of the National Center on Education and the Economy, a Washington-based research and advocacy group, in a 2010 report, “Standing on the Shoulders of Giants.”
Such exams, which typically are set to national standards and derived from a national curriculum, create strong incentives for students to work hard and take tough courses, explains Tucker, who served on the advisory board for Quality Counts 2012. “Students who do not do that will not earn the credentials they need to achieve their dream, whether that dream is becoming a brain surgeon or an auto mechanic.”
John H. Bishop, a Cornell University professor who has studied gateway exams, says the pressure has ripple effects. “It automatically produces stakes for the teachers, even if there is nothing formal about it,” he says.
In a 2005 article, Bishop noted that the high school exit exams in many countries, typically developed by the education ministry, last two weeks or more, with the curriculum-based tests for each subject lasting three hours or longer. They generally require students to write essays, describe science experiments, and show how they solve multistep mathematics problems, he explained. Also, the exams usually signal different levels of achievement, not just whether a student has met a minimum standard.
But many U.S. analysts are skeptical of importing European- or Asian-style gateway exams.
“I think we have a wonderfully different system,” says Marshall “Mike” Smith, a former U.S. deputy secretary of education and a visiting scholar at Harvard University. “We have multiple opportunities for kids to get to college, and I think that’s one of our greatest strengths.” (“Even With Educated Workforce, U.S. College, Career Issues Loom,” this issue.)
Some external exams do count for U.S. students, but there are significant differences. First, nearly half of states have exit exams students must pass to graduate, but analysts say they generally set a low bar. Also, U.S. universities generally don’t consider the results in making admissions decisions. The privately run SAT and ACT certainly are high-stakes exams, but those voluntary tests are not directly connected to the curriculum, and analysts say schools feel little responsibility for student performance on them.
In any case, some observers suggest the United States could benefit from more incentives for students to take high school assessments more seriously. In fact, the effort to develop common standards may create a new avenue. A variety of higher education institutions have signaled that, in making placement decisions, they would recognize the high school exams being crafted by two state consortia as part of the common-standards initiative. The results would not determine admission, but would allow students to skip the remedial courses many universities require some students to complete.
“We’ve always thought about these college-ready assessments as door openers, not door closers, not the same as exit exams,” says Michael Cohen, the president of Achieve, a Washington-based group working with the Partnership for Assessment of Readiness for College and Careers, or PARCC, a consortium composed of 23 states, plus the District of Columbia.
Tests and Incentives
Meanwhile, teacher evaluation has become a core theme in U.S. discussions of accountability, with a push to base those evaluations—as well as decisions on pay, tenure, or dismissal—at least in part on student test scores.
A 2011 OECD report notes that across the organization’s 34 countries, teachers are judged, and in some cases rewarded, on a range of criteria. They include qualifications, how teachers operate in a classroom setting (such as attitudes, expectations, and instructional strategies), and measures of effectiveness. Instruments include standardized assessments, classroom observations, teacher interviews, and parent ratings.
Andreas Schleicher, the OECD’s education director, says Singapore has developed an especially “systematic and thoughtful” evaluation system.
“They have a range of criteria that feed into this judgment, including test scores, including professional judgments, including inspections,” he says. “There are a lot of things you need to do as a teacher to demonstrate performance.”
As an incentive, Singapore awards performance bonuses to teachers.
Some observers also note that informal accountability, where teachers feel a professional responsibility to one another, plays a powerful role in countries such as Singapore and Japan.
Stepping back, many analysts caution against the temptation simply to cherry-pick isolated elements of another nation’s education system. There are important structural factors of the U.S. system to keep in mind, not to mention social and cultural differences, and political realities.
That said, a variety of observers say the United States may be reaching a pivotal point on educational accountability, especially with recent efforts, following previous false starts, to revise the NCLB law. With that in mind, this could be a time to test out some different approaches, such as school inspections, says Jerald.
“A certain number of states could try it out to get a sense of whether inspection can translate well in the United States,” he says. “It would be a very different approach.”
Jerald adds: “After 10 years of a very standardized approach to accountability that has had its advantages and disadvantages, it’s time to experiment a little bit.”