Investing in Teacher Tests: Issues for the States Are Complex
As states move to test the skills of prospective teachers, their officials have become caught up in complex testing questions for which there appear to be no easy answers.
Whether they decide to rely on prepackaged national examinations or to create their own tests with outside help, the investment is a substantial one. But expert opinion is divided, not only on which type of test is best, but also on how a test should be constructed to measure what it is intended to.
The complexity of the decision faced by state officials--who are often under legislative mandates to move rapidly--is also heightened by what appears to be a growing rivalry for clients among test makers, whose wares differ significantly.
State officials' dilemma was highlighted during a two-day meeting late last month in Chicago--which included representatives from 13 states that are using or considering the use of teacher examinations--and in subsequent interviews with state officials and testing experts.
The Chicago meeting was sponsored by National Evaluation Systems Inc., a for-profit education-research firm that has worked with a number of states to develop customized teacher tests.
It and other firms have created new competition for the National Teacher Examinations, the series of assessments developed by the Educational Testing Service that are now used in some 27 states. Some assert that the ETS is working to modify its tests in direct response to the newer rivals.
"The choice of tests to use is one that many states are wrestling with right now," said Joyce R. McLarty, assistant commissioner for school success in Tennessee. "From my point of view, things are still milling around a great deal. I don't think it has settled down to the point yet where we have a clear sense of what the best approach is going to be."
The cost of creating a test from scratch averages about $70,000 per subject area, according to Ms. McLarty. And some states are designing as many as 65 to 80 different exams for individual certification areas.
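As a rough illustration of the scale involved, the figures Ms. McLarty cites can be multiplied out. The sketch below assumes the $70,000 average applies uniformly to every exam, which the article does not state; the function and variable names are hypothetical.

```python
# Average development cost per subject-area test, as cited by Ms. McLarty.
COST_PER_SUBJECT_AREA = 70_000  # dollars


def development_cost(num_exams, cost_per_exam=COST_PER_SUBJECT_AREA):
    """Estimate total development cost, assuming every exam costs the average."""
    return num_exams * cost_per_exam


# The article says some states are designing 65 to 80 different exams.
low_estimate = development_cost(65)
high_estimate = development_cost(80)

print(f"65 exams: ${low_estimate:,}")
print(f"80 exams: ${high_estimate:,}")
```

Under that assumption, a state building its own full battery would be looking at roughly $4.55 million to $5.6 million in development costs alone.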
In contrast, the prepackaged national assessments that many states use--primarily, the core battery and subject-area tests of the NTE--are "a lot cheaper," according to W. James Popham, professor in the graduate school of education at the University of California-Los Angeles and founder of IOX Assessment Associates, the firm that developed the Arkansas teacher test. "But just because it's cheaper, that doesn't mean it's a good thing to do," he said.
Some, including Mr. Popham, who are critical of the NTE argue that "criterion-referenced" tests--which measure the test taker's mastery of information--are much more suitable for teacher testing and diagnosis than "norm-referenced" tests--which compare individual performances to those of the whole test group.
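The distinction between the two approaches can be shown with a short sketch. It is illustrative only: the scores, the cut score of 70, and the function names are invented for the example and do not come from any actual test described here.

```python
from bisect import bisect_left


def criterion_referenced(score, cut_score):
    """Pass/fail against a fixed standard; other test takers are irrelevant."""
    return score >= cut_score


def norm_referenced(score, group_scores):
    """Percentile rank: the percentage of the group scoring strictly below."""
    ordered = sorted(group_scores)
    below = bisect_left(ordered, score)
    return 100.0 * below / len(ordered)


# Hypothetical data: one candidate, two different comparison groups.
candidate = 74
typical_group = [48, 55, 61, 67, 70, 74, 79, 83, 88, 92]
stronger_group = [s + 10 for s in typical_group]

passed = criterion_referenced(candidate, cut_score=70)
rank_typical = norm_referenced(candidate, typical_group)
rank_stronger = norm_referenced(candidate, stronger_group)

print(passed, rank_typical, rank_stronger)
```

In this sketch the candidate's pass/fail result depends only on the cut score, while the percentile rank shifts with the makeup of the comparison group.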
According to Mr. Popham, the NTE exams are norm-referenced tests designed for the purpose of comparison and unsuited for current licensing needs.
Mr. Popham claims the NTE does not do a good job of assessing well-defined skills that educators need to possess in order to function as teachers--a charge that ETS officials deny.
"Particularly in this era of litigation, where one would have to demonstrate the patent relevance of the test to what goes on in the classroom," contended Mr. Popham, "it would be very hard to argue that the NTE would work out okay."
Starting From Scratch
In contrast, most customized state tests are criterion-referenced instruments that measure specific areas of knowledge or skill against set "criteria" or a "cut score."
A number of states break out subscores on individual test objectives--such as reading, writing, and mathematics--on a statewide basis to determine the areas most in need of remediation across the entire pool of test takers.
Some testing experts, such as Lester M. Solomon, director of teacher assessment for Georgia, maintain that unless a criterion-referenced test is used, the test results will not provide the individual or the state with enough information to design remedial programs for those who fail--an important aspect of any teacher-testing program, he argues.
For years, the NTE has reported only a total test score to test takers and has not included subscores for various objectives.
But this fall, for the first time, the Educational Testing Service is planning to tell people who take the core battery how many questions there are in each section of the test, and what percentage of those questions they answered correctly. Similar data also will be provided to the colleges and universities the prospective teachers attend.
And next spring, the same kind of information will be available for the ETS-operated preprofessional-skills test, which some states require for admission into teacher-education programs.
According to Robby Champion, a specialist in teacher education with the Maryland Department of Education, the ETS is changing its reporting format in response to "market demands." The ETS "realizes it must compete with other testing companies, and it is becoming more responsive," she said.
Marlene Goodison, a program administrator for teacher programs and services for the ETS, cautioned, however, that "unless a test is very lengthy and very comprehensive, it is not a diagnostic test."
"You can tell people about their performance on the test," she said, "but generally speaking, the content sampling of the predictive areas in the test is not adequate to be what one would call a diagnostic test." She warned that such test results should not be "overinterpreted" in developing remediation programs.
Ms. Goodison also maintained--contrary to others interviewed--that the NTE tests are criterion-referenced. States that use an NTE test set their own absolute cut-off score to determine whether an individual has passed or failed the exam, she said, and do not judge the test taker's status in relation to all others. Although the ETS has reference-group data for all the people who have taken the NTE, she added, the tests are not nationally normed.
Question of Ownership
But even if all agreed with this argument, states still may have strong reasons for creating their own tests, according to some of those interviewed.
"The concept that customization made them our tests was extremely important," said Robert Gabrys, director of the office of educational personnel development in West Virginia, which is in the fourth year of custom-developing its teacher assessments.
West Virginia's teacher tests were designed to reflect "learning outcomes" in each subject area for public-school students in grades K-12.
"That's probably one of the strongest selling points in gathering support and resisting criticism about why we're testing," said Mr. Gabrys. "We can trace learning outcomes in schools to objectives in our teacher-training curriculum to test objectives and items."
People across the state offered advice on the test design, he said, with the result that "the decisions are ours and the ownership of the tests is also ours."
To use tests like the NTE, states must also "validate" the exam--a process that ensures that each test question measures what it is supposed to, reflects knowledge that teachers in that state really need, and consists of information that the test taker had a previous opportunity to learn.
Validation is particularly important when using nationally developed tests at the state level, lawyers emphasize, because questions that are appropriate nationally may not be appropriate in a particular state. And too many such questions make a test unfair at best and useless at worst, they say, opening the door to potential lawsuits.
"One can almost make a prediction that anyone who gives a major competency test of any type" is going to be sued, said Michael A. Rebell, a lawyer who has worked extensively on teacher-testing issues. The best educators can do to anticipate future legal action, he said, is to execute the "most sound, valid psychometric judgments possible."
Good validation studies, according to some state officials, can cost almost as much as developing a test from scratch.
In addition, when a state validates a national test such as the NTE core battery, it may find that as many as 20 percent of the test items are not appropriate. State standards for scoring the tests then have to be readjusted to correct for the presence of invalid items. If too many items on a given test are invalid, the test cannot be used.
Ms. Champion of the Maryland education department pointed out, however, that there are some advantages to relying on a national exam that is used in different states.
The ETS has greater resources and expertise than are available to most states, she said. Use of the exams in numerous locations is also attractive to teachers who may be interested in moving from state to state.
Donald V. Watkins, an Alabama lawyer who is involved in a four-year court battle over that state's teacher tests, said that "most state departments of education do not have the technical expertise in-house to develop their own competency-testing programs." Nor is there a regulatory body to monitor what commercial testing companies do, he said.
Recently, some educators have discussed the idea of states working together to define a list of test objectives and develop test items that individual jurisdictions could then adapt at a reduced cost. Such an approach would save states from "reinventing the wheel," they say.
Last month, for example, Georgia agreed to allow Florida and the District of Columbia to use items and objectives from its teacher tests as a basis for developing their own. The two jurisdictions will pay Georgia $5,000 for every subject-area examination they use.
Joint efforts could be particularly useful in developing low-volume tests--such as those to license Japanese-language teachers--for which it is hard to recoup the cost of development from examination fees.
Several of those interviewed argued that this approach is more realistic than the concept of a national teacher examination, which would require states to relinquish some of their own authority and control.
"From a practical perspective," noted Michael L. Chernoff, director of marketing for National Evaluation Systems Inc., "by the time there is a national test, a lot of the states will have implemented tests themselves."