In 1958, a group of international scholars met in Hamburg, Germany, and hatched an idea for a huge study to measure student learning around the globe.
They saw the world as one big educational laboratory, with each country acting as its own naturally occurring experiment. If tests could gauge the effects of those experiments, the researchers reasoned, the results might yield a bonanza on how best to teach children.
Nearly 50 years later, the project they had in mind is called the Trends in International Mathematics and Science Study, or TIMSS, one of the biggest and most influential assessment programs in the world. Yet it still hasn’t delivered on its early promise, say experts who attended a conference in Washington this month aimed at rekindling the original vision of the program’s founders.
“It sort of became a cognitive Olympics instead,” said Judith Torney-Purta, a professor of human development at the University of Maryland College Park, referring to the country-by-country rankings for which the TIMSS reports are best known. The program, she said, “seemed to miss out on becoming a major contributor of international studies in identifying effective practices and adapting them.”
Ms. Torney-Purta, who in the 1960s took part in the development of what is now TIMSS, was among the group of international researchers who gathered for the Nov. 9-11 conference at the Brookings Institution. They shared the results of secondary analyses of data from TIMSS and other international studies, and encouraged more researchers to tap into the mounting troves of international achievement data.
An analysis of 4th graders’ mathematics scores on the 2003 Trends in International Mathematics and Science Study found that changes in test-takers’ average ages affected achievement.
SOURCE: Jan-Eric Gustafsson
“What we’ve got to do more of now are two things,” said Seamus Hegarty, the chairman of the International Association for the Evaluation of Educational Achievement, or IEA, which oversees TIMSS and other international studies. “We’ve got to ensure better, more systematic secondary analyses, and we’ve got to relate our findings to policy interests.”
At the Amsterdam-based IEA and other international-study centers, the data have indeed piled up since the late 1950s.
Just 12 countries took part in the earliest version of TIMSS, the First International Mathematics Study, or FIMS, which was published in 1967 using math data collected from 1961 to 1965. Since then, the IEA has administered at least four more cross-national studies in mathematics, science, or both, and the number of participating countries has grown with each test administration. TIMSS 2007, already under way, is expected to involve more than 60 countries.
In addition, the assessment organization has conducted cross-national studies in two other subjects, civics education and literacy. Data on student achievement are also accumulating through the Program for International Student Assessment, or PISA, a multinational study run by the Paris-based Organization for Economic Cooperation and Development.
“It’s a gold mine, really,” Jan-Eric Gustafsson, an education professor at Sweden’s University of Gothenburg, said of the TIMSS data.
So far, researchers have plumbed results from the various studies to look at how a wide range of educational factors might affect achievement.
Those factors include students’ attitudes and beliefs; variations in the size of schools and classes; students’ family backgrounds; classroom technology use; and the extent to which teachers use such approaches as group work and inquiry-driven instruction.
For instance, Elena C. Papanastasiou, a researcher from Intercollege in Nicosia, Cyprus, mined the TIMSS data archives to explore how computers and electronic calculators affect learning.
She focused on 2003 results for 8th graders from four countries (Cyprus, the Russian Federation, South Africa, and the United States) and adjusted for socioeconomic differences among students within each. She found that students who frequently used computers and calculators, both inside and outside of class, tended to score lower on TIMSS. The one exception came in the United States, where calculator use, but not computer use, was linked to slightly higher scores.
But such cross-sectional comparisons can also be limited, other researchers at the Washington conference said, citing marked variations in educational and cultural customs.
In the technology study, for instance, the results might have been skewed by a tendency among U.S. schools to have remedial students practice math on computers. But advanced U.S. math students tend to use calculators in class more often than their less-skilled peers.
Cypriot educators, in comparison, tend to discourage all students from using computers and calculators, according to Ms. Papanastasiou.
“TIMSS, in my opinion, is a lot better for hypothesis generation than for hypothesis testing,” said Gerald K. LeTendre, an education professor at Pennsylvania State University in University Park.
Age Shifts Eyed
One way to overcome the cultural and pedagogical differences across countries that hamper analyses of effective practices, Mr. Gustafsson suggested, might be to focus on the changes that occur within countries from one administration of a test to the next.
“The problem for cross-sectional analyses is that if you have a characteristic you want to measure, it tends to be correlated with a thousand other things,” he said. By looking over time within one country, he said, scholars might minimize those “nuisance” factors.
Mr. Gustafsson tested his idea with data for 15 to 22 countries that participated in TIMSS tests in both 1995 and 2003. His aim was to see if changes in students’ ages and in average class sizes within a country, from one test to the next, correlated with changes in achievement.
Mr. Gustafsson found some surprisingly large age differences. In Latvia and Lithuania, for instance, 4th graders were eight to nine months older in the 2003 assessments than their counterparts in 1995 were.
The Iranian 4th graders tested, by contrast, were three months younger in 2003.
The analysis showed that age changes were linked to achievement differences, with older students in every country outperforming their younger peers in the same grade. The relationships were strong enough, Mr. Gustafsson said, that TIMSS researchers might want to take them into account in interpreting country-by-country achievement gains—either by narrowing the testing window so that test-takers are closer in age or making statistical adjustments.
Changes in average class sizes from one test to the next, meanwhile, seemed to be important for 4th graders’ achievement and less so for 8th graders.
Most researchers, though, have focused on curricula in an effort to discern why students in some countries tend to outshine the rest of the world, including the United States, in international comparisons.
As the principal of a Finnish intermediate-level school that is arguably the highest-scoring school in the world, Maarit Rossi, another conference-goer, has fielded many such queries. Finland ranked first in math in the 2003 PISA, and the 8th graders in Ms. Rossi’s school, Kirkkoharjun School in Kirkkonummi, scored highest in that nation.
Now studying in the United States on a sabbatical, Ms. Rossi sees obvious contrasts in U.S. and Finnish textbooks. The U.S. texts, she said, are much thicker and more cluttered than the ones her students use. “It’s impossible when you have 1,100 pages of math that you get the message,” she said.
William H. Schmidt, an education professor at Michigan State University in East Lansing, would agree. He has conducted comparisons of U.S. math curricula and those used by countries that consistently score high on TIMSS. As early as the late 1990s, he characterized U.S. math classes as “a mile wide and an inch deep” compared with those of the high-scoring, mostly Asian, nations.
“It’s basically, you cover everything, everywhere, because somehow, somebody will learn something somewhere,” Mr. Schmidt told conference-goers.
More recently, his analyses have also shown that the high-performing countries teach math in a sequence that mathematicians see as more coherent, and that may be even more influential in promoting students’ understanding.
Another researcher at the Brookings Institution conference, however, said Mr. Schmidt was looking in the wrong direction for explanations of U.S. students’ lackluster performance.
“Sociological theories suggest that educational systems are becoming more similar around the world,” said David P. Baker, a professor of sociology and education at Penn State. Because most countries now manage and organize schools in much the same way and teach similar content, he argued, other factors, such as students’ family background, may explain more of the test-score variations between nations than differences in schooling.
He noted, for instance, that countries that do well on the international assessments tend to be those, such as Finland or Singapore, with less socioeconomic inequality among students. Countries with wide gaps between society’s haves and have-nots, on the other hand, tend to have greater variations in their own students’ test scores.
“The notion that the world is an education laboratory is a good fantasy to push to get funding,” Mr. Baker concluded. As schools become more and more similar around the world, he added, the possibility that researchers can distill best practices in education from international achievement is becoming more remote.
But the University of Maryland’s Ms. Torney-Purta said, “We haven’t had really very many problems doing secondary analyses,” referring to studies that she and colleagues had done using TIMSS data on students’ civics achievement. “I think there’s enormous potential.”
Coverage of education research is supported in part by a grant from the Spencer Foundation.
A version of this article appeared in the November 29, 2006 edition of Education Week as Potential of Global Tests Seen as Unrealized