Can Head Start Pass the Test?
The proposed Head Start reporting system has all the characteristics and dangers of a high-stakes test.
The U.S. Department of Health and Human Services recently announced a new "national reporting system" for Head Start that calls for preschoolers nationwide to be given twice-yearly achievement tests beginning this coming fall. Designed to supplement the "outcomes framework" implemented two years ago, the system will require more than 500,000 hours to implement, at a cost that has yet to be determined. Its purpose is to improve program monitoring and guide technical-assistance efforts and transitions to school.
These goals are laudable. The question is whether the means the department has chosen to reach them—a system of achievement tests for very young children—will succeed. If not, I fear the system may end up harming Head Start rather than improving it.
Plans for the reporting system call for individual assessments of each of the half-million 4- and 5-year-olds in Head Start in the fall and spring by their more than 30,000 teachers. The teachers will receive brief training in administering the assessment this summer. The proposed tests are drawn from existing instruments that assess children on a number of performance measures, including recognizing a word as a unit of print, identifying at least 10 letters of the alphabet, associating sounds with written words, and so forth. These indicators were incorporated into Congress' reauthorization of Head Start in 1998.
All of this sounds reasonable enough. Head Start, staffed primarily by teachers who are poorly paid and not uniformly well-trained, has had an uneven record of quality since its inception in 1965. Hardly anyone can argue with the need for public programs to be held accountable, and testing today is the coin of the educational realm. So, what's wrong with testing Head Start children?
Here's what's wrong: Though not labeled "high stakes," the proposed plan has all of the characteristics and potential dangers of a high-stakes test. Research demonstrates, for example, that the labeling that accompanies high-stakes tests can have a long-term impact on teachers' perceptions of children's ability to learn; can result in stigmatizing children and tracking them into low-achieving groups; and can make a long-lasting impression on children's self-perceptions, estimates of their own abilities, and motivation and achievement. These consequences are very real for young children.
Another danger of the proposed reporting system lies in the narrowness of its content. The proposed test covers the congressionally mandated indicators, but it omits a huge portion of what is taught and learned in high-quality Head Start programs and other preschools, including appreciation for books and reading; comprehension; early writing; numbers and operations; geometry; measurement; scientific knowledge, skills, and methods; and anything having to do with social-emotional development, social studies, the arts, and physical growth and development. The reason content is important is that high-stakes tests, by means of "measurement-driven instruction," have a powerful impact on what is taught and what is learned.
In short, as research with older children suggests, the new reporting system can change Head Start by narrowing its focus and altering what is taught and learned. This narrowed focus may further endanger the very children Head Start was meant to help: children who are developmentally at risk. The tremendous diversity of the Head Start population will be hard-pressed to conform to a single vision of what young children should know and be able to do at a particular point in time. Instead of eliminating the failures of the Head Start program, this system may teach young children to view themselves as failures, simply because they see things differently from the way the test developers do, or learn skills in ways that differ from the statement of goals incorporated in the congressional indicators.
To imagine that achievement tests can be imported into the lives of 4- and 5-year-olds without negative consequences for children, families, and teachers is unwise at best and irresponsible at worst. Even President Bush's "No Child Left Behind" Act of 2001 does not mandate testing until 3rd grade. We know that conventional achievement tests are flawed. Research demonstrates that no more than 25 percent of the variance in early academic or cognitive performance can be predicted from preschool or kindergarten test scores.
Fundamentally, high-stakes testing is inconsistent with meaningful learning in early childhood. This is a time of dramatic developmental change, a critical period of transition from home to school, and an interval of heightened sensitivity to socialization, openness to exploration, and trying-out of the self in relation to others. It is not a time to highlight failure or to impose narrow views of learning and achievement. High standards are appropriate. High stakes are not.
But the solution is not to reject testing and assessment. We are living in a policy climate that is committed to accountability in public educational programs. Indeed, high-quality teaching calls for high-quality assessment—the two are inextricably linked.
To improve teaching, we need comprehensive, classroom-based evidence about what children are learning that can be translated easily into meaningful instructional strategies to enhance teaching and improve learning. Systematic, well-researched, observational assessments, whose results can be aggregated across programs, can accomplish this.
If a national reporting system is implemented, it must be built around a matrix sampling plan. This type of design is used in the National Assessment of Educational Progress and other large-scale assessments. It involves giving parts of tests to individual children, but giving no single child all of the items on a test. Statistically, it is then possible to construct composites that tell us precisely how well children in specific sampling units are doing.
A matrix sampling plan lowers the risk of teaching to the test, because it makes it difficult to know which items will actually be administered. It is also very efficient and effective for demonstrating program effects.
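The mechanics of matrix sampling can be sketched in a few lines of code. This is an illustrative simulation only, with hypothetical item names and a made-up response model; it is not drawn from NAEP or any proposed Head Start instrument. It shows the core idea: the item pool is split into short forms, each child takes only one form, yet aggregating across children yields an estimate for every item.

```python
import random

def make_forms(item_pool, n_forms):
    """Split the full item pool into n_forms disjoint short forms,
    so no single child is given every item on the test."""
    pool = list(item_pool)
    return [pool[i::n_forms] for i in range(n_forms)]

def administer(response_fn, children, forms):
    """Assign each child one form in rotation and collect
    item-level responses (1 = correct, 0 = incorrect)."""
    by_item = {}
    for idx, child in enumerate(children):
        for item in forms[idx % len(forms)]:
            by_item.setdefault(item, []).append(response_fn(child, item))
    return by_item

def proportion_correct(by_item):
    """Aggregate across children into a per-item proportion correct --
    a program-level estimate, not a score for any individual child."""
    return {item: sum(v) / len(v) for item, v in by_item.items()}

# Usage: 12 hypothetical items split into 3 four-item forms for 30 children.
items = [f"item{i}" for i in range(12)]
forms = make_forms(items, 3)
children = list(range(30))
# Made-up response model: each child answers each item correctly 60% of the time.
data = administer(lambda child, item: int(random.random() < 0.6), children, forms)
estimates = proportion_correct(data)
```

Note that each child here answers only 4 of the 12 items, yet every item ends up with responses from a third of the children, so program-level estimates cover the whole pool while no child's individual test reveals which items matter.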
Since a matrix design requires substantial lead time to develop, a field trial should be mounted in the fall, rather than rushing to put the whole system in place within the next six to eight months. This would not only provide more opportunity to train teachers, figure out the logistics of testing hundreds of thousands of children, and develop the alternate forms of the test required for matrix sampling, but it would also enable the government to try out and validate all of the subtests that have been proposed for the test, some of which have never been used with children this young or with children enrolled in Head Start.
A more extensive field trial would provide time to learn how to deal comprehensively with questions of how English-language learners and children with disabilities will be engaged in this effort. Moreover, it would give Congress an opportunity to provide oversight on this important and potentially policy-changing initiative—something that has not yet occurred.
If we can move ahead on adopting a matrix sampling design for the proposed reporting system; if we can ensure that the system is composed of subtests that are reliable, valid, and fair; and if we can have adequate time to learn how to mount what would be the largest effort in history to test young children without creating chaos and confusion, then we will have created a system that has a chance of assisting young, at-risk children.
Otherwise, I fear that Head Start will not only fail the test, but may not survive it.