Matching Up States, Countries Offers Fresh Perspective
Comparing Finland and Minnesota may be more apt than looking nation to nation, testing experts say, but the analysis needs to go beyond just scores
As concern over America's competitiveness abroad intensifies, education officials in the U.S. are beginning to consider using individual states and districts—not just the nation as a whole—as the units against which to measure their international peers.
In everything from population demographics to curriculum adoption, a country like Finland may be more comparable to an individual state like Minnesota than it is to the heterogeneous expanse of the United States—leading some policymakers and researchers to reason that such state-to-country comparisons can better highlight educational practices.
Yet education and testing experts warn that if such comparisons are to be useful, educators must go beyond basic test rankings to understand how countries' specific policies and practices can make U.S. students more competitive. Some states, such as Massachusetts and Minnesota, are already comparing both their student achievement and educational practices to those of other countries via international tests and other studies.
International assessment expert Gary W. Phillips of the American Institutes for Research believes most states will eventually have to participate in tests like the high school-level Program for International Student Assessment and the 4th- and 8th-grade-level Trends in International Mathematics and Science Study in order to compete for global businesses. He likened it to the evolution of the National Assessment of Educational Progress, often known as "the nation's report card," which started primarily as a research tool and now is used as a pacesetter in state accountability report cards. "When NAEP first started, each state was living in its own little Lake Wobegon world, not knowing how it stacked up to other states," he says.
"Now the same is true internationally. Around the world a lot of these countries are eating our lunch. They are focusing on education in a way that we aren't," Phillips says. "We have to know how we stack up. That's why these studies are important; they allow us to benchmark what we do and know with what they know and do."
Andreas Schleicher, the head of education indicators and analysis programs for the Organization for Economic Cooperation and Development, which administers the PISA, says projects to compare students internationally can help America catch up to the global norm: Countries such as Germany, the United Kingdom, Belgium, and Canada, along with most other nations with federal forms of government, already compare their states or provinces to international testing benchmarks.
"If you go to Canada, the kinds of policies and practices that Alberta has put in place are very different from those that Newfoundland has put in place, or Ontario," Schleicher says. "It produces much better policy insights to compare [state or provincial] education systems, than nations'. That's where systems differ, often significantly. In my country, Germany, the northern states and the southern states have quite different education systems. It makes much more sense to compare those systems [individually], than to mix them up in the aggregate."
Researchers such as Phillips and economist Eric A. Hanushek, of Stanford University, have long used statistical analyses to try to link a student's proficiency on an American test such as NAEP to international tests, with varying degrees of success. The tests are given in different years, cover different topics within the same subject, and assess different groups of children: PISA tests by age, for example, while NAEP and TIMSS test by academic grade.
"To date the only truly valid way to compare at [the state] level is you have to actually administer the international assessments in the states"—which no states now do statewide with PISA, Phillips says. "It's more time consuming, but it is not based on an edifice of assumptions."
State as 'Country'
But how can a single state compare itself to a sovereign nation? In some ways, it's easier to do, both in testing and analysis, than comparing nations. A state needs a smaller sample of students to get a representative group to study than America as a whole requires, and individual states can think more about how adding a new test will fit into their schools' academic schedules.
As educators and policymakers seek to prepare students for a competitive global economy, U.S. states have looked to other nations to inform the development or revision of their own academic content standards. International comparisons are more common in mathematics and science, subjects often linked to economic competitiveness and technological innovation, than in English/language arts or social studies. In responses to a survey from the Editorial Projects in Education Research Center, state education agency staff members most frequently cited standards from Singapore as models for current standards in mathematics and/or science.
"You can do things with 3 or 4 million people that it's very hard to do with 300 million," says Marshall S. "Mike" Smith, a visiting professor at Harvard University in Cambridge, Mass., and a visiting scholar of the Carnegie Foundation for the Advancement of Teaching. "If you make the assumption that we're not going to change the [U.S.] Constitution … the states are responsible, and they are going to remain responsible, for education."
Because the Constitution leaves ultimate control over education to each state rather than to the federal government, a state can analyze its test results in a context closer to that of a country with a national education system.
"The U.S. state-to-state diversity is so huge that when you think of PISA at the national level, the aggregate just doesn't tell us very much," says V. Darleen Opfer, the director of the education division for the international research group RAND Corp. "A comparison between Finland and Vermont or Finland and Wisconsin, that might be a better comparison in terms of size, population, homogeneity of the population, socio-economic status issues. Like everything else in education right now, the disaggregation is what matters—at least it tells us more about what's going on and what we can do about it."
That's why Massachusetts and Minnesota participated in the 2007 TIMSS as independent "countries."
Former Massachusetts education Commissioner David Driscoll pushed for the state, which regularly leads the nation in NAEP performance, to participate in the TIMSS after seeing America's lackluster national performance.
"We can't just pat ourselves on the back because we have the highest NAEP scores; there are other places out there that are clearly outperforming us, and we have to find out who they are and where and what we can learn from them," says Robert Lee, the chief analyst for student assessment services in Massachusetts. State education leaders weren't "sure we were going to be at the top of the heap, knowing that the U.S. is not with the Singapores of the world, [and] wanted to get a realistic sense of where our students measured up against the rest of the world."
As it turned out, both Massachusetts and Minnesota scored well above the national and international averages on the 2007 TIMSS; Massachusetts 4th graders led peers in all 59 participating countries and states but Hong Kong and Singapore in math, and Minnesota students outperformed all but Hong Kong, Singapore, Chinese Taipei, and Japan in the same subject. Hong Kong and Shanghai, like Massachusetts, participate in the TIMSS as separate "countries."
Yet the more detailed results available because Massachusetts had compared itself as a country, rather than as part of the national sample, turned up intriguing differences, Lee says. For example, the state found that while the average number of hours of math instruction across the United States went down between the 1999 and 2007 TIMSS, it rose in top-performing countries, as well as in Massachusetts, from 141 hours per year in 1999 to 155 hours in 2007. Class size wasn't a big factor in how the state's students performed on TIMSS, but the state's math and science curricula weren't as rigorous as those used in countries like Singapore and Hong Kong, where 40 percent of students scored at the advanced level, compared with only about 15 percent in Massachusetts.
The state is doing an item-by-item comparison of performance on test questions in its math and science curricula. "I believe the kids in Massachusetts are very bright, and they could meet that [advanced] benchmark; I feel we're just not challenging those middle-level kids enough," Lee says. "[For] every item we look at, we're seeing our state on a spectrum with many other nations. You get a chance to reflect on whether your curriculum is aptly pitched for students who are going to have to compete in a global job market."
Moving to full international testing takes serious commitment from states, though. Massachusetts spent about $450,000 per grade to administer the TIMSS in 2007; it added 72 minutes of additional testing per student in grade 4 and 90 minutes in grade 8. Yet more states seem to be considering the cost worth the rewards; in the 2011 test administration, eight states, in addition to Massachusetts, participated in the TIMSS as independent "countries": Alabama, California, Colorado, Connecticut, Florida, Indiana, Minnesota and North Carolina.
Moreover, the National Center for Education Statistics is in the middle of a study that will link the NAEP and TIMSS with both a more detailed statistical analysis and the full state results. This year, states gave 4th and 8th grade students taking the NAEP some math problems linked to TIMSS, as well as an extra NAEP science assessment aligned to the international study. Then, during the later administration of TIMSS, the researchers did another linking study incorporating the same questions but using TIMSS testing procedures. This creates a pool of test questions solved by students taking both tests, which can be used to align their scores.
"It's a really strong design. In the past we and others have tried to link by using statistical techniques … but not literally giving the same test to the same kids, which is a much better way to do it," says Sean P. "Jack" Buckley, the NCES commissioner.
Yet Buckley warns, "These kinds of studies are great at producing [ranking] tables, but they're not very good at causality. International assessment data is useful at minimum for hypothesis generation, identifying top performers … [but] figuring out what works in education is really painstaking, and you're not going to get it from rankings."
Some school districts are starting to explore these finer-grained comparisons, too. For the last three years, the 4,700-student Scarsdale, N.Y., public school district has partnered with Teachers College, Columbia University, to compare its own teaching practices with those of countries that outperformed its students on the PISA. Teachers College researcher Ruth Vinz is conducting classroom observations, teacher interviews, and reviews of student work, both in the district and in Australia, Singapore, Finland, and Shanghai.
"PISA and others like it compare tests, but don't give you a sense of the curriculum that built the capacity of the students being tested," says Lynne Shain, Scarsdale's assistant superintendent for instruction. "We hope to understand the components of the teaching and learning process that foster higher-order critical and creative thinking and problem-solving."
The district found that students were not actively questioning things they were taught in class—a high developmental skill included on the PISA. The district has since changed its local benchmarking assessments to include more questions that have more than one answer, or that require students to question the information they are given.
Smith, a former top education official in three separate U.S. Department of Education administrations, says more states and districts should follow Scarsdale's lead. "PISA is specifically designed to be an assessment that requires transfer, and asks questions in the areas that you would not really pick up in the regular curriculum: They're longer, take a whole page to write about; they're the kind of questions different from the normal questions we expect on tests," he says. "In my view, we ought to be as a country creating our own PISA-like assessments if we're serious about the 21st-century skills."
Vol. 31, Issue 16, Pages 36, 38
Published in Print: January 12, 2012, as Matching Up States, Countries Gives Fresh Take on Performance