Military Experimenting With Computerized Aptitude Tests
Military researchers are experimenting with a computerized version of a group of vocational-aptitude tests administered to about 800,000 armed-services personnel annually.
The new system, which requires test-takers to sit at computer terminals and answer questions electronically, is likely to revolutionize the whole standardized-testing field and make current paper-and-pencil methods obsolete, those working on the idea believe.
Among them, in fact, is the nation's largest test-making organization, the Educational Testing Service (ets) in Princeton, N.J. Its researchers also are experimenting with computer-based tests, as are researchers at the University of Minnesota, although both groups acknowledge that the military effort has a "head start."
'Computerized Adaptive Testing'
The military's initial goal is to develop the "computerized adaptive testing" (cat) system for the Armed Forces Vocational Aptitude Battery, a paper-and-pencil test used in placing military personnel in various occupational specialties.
The paper-and-pencil approach forces the testers to gear the test to the middle range of ability, explains Michael Wiskoff, a psychologist who is program director for personnel and occupational measurement at the Naval Personnel Research and Development Center (nprdc) at San Diego. As a result, he says, the test is too hard for those at the lower ranges of ability and too easy for those at the higher ranges, and thus does not yield an accurate picture of the characteristics of all the test-takers.
As conceived by the Navy and other researchers, computerized adaptive testing tailors each test to the individual taking it.
The test begins with an item of medium difficulty, Mr. Wiskoff explains. If that item is answered correctly, the computer selects a more difficult item for the next question; if the first item is answered incorrectly, an easier question follows. The computer continues to adapt the test to the abilities of the individual throughout the test.
The result, says Mr. Wiskoff, is a test that gives the armed services a more precise look at the varied abilities of those who take the battery's various subtests--verbal, mathematical, and mechanical.
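The branching procedure Mr. Wiskoff describes can be sketched in a few lines of code. This is a minimal illustration only, with a hypothetical item bank and simple bracketing rules, not the Navy's actual algorithm (which draws on more sophisticated testing theory): start at medium difficulty, move to a harder item after a correct answer and an easier one after a miss.

```python
def adaptive_test(item_bank, answer_fn, num_items=5):
    """Administer up to num_items questions adaptively.

    item_bank: list of items sorted from easiest to hardest.
    answer_fn(item): returns True if the test-taker answers correctly.
    Returns the list of (item, correct) pairs administered.
    """
    low, high = 0, len(item_bank) - 1
    index = (low + high) // 2           # begin with a medium-difficulty item
    administered = []
    for _ in range(num_items):
        item = item_bank[index]
        correct = answer_fn(item)
        administered.append((item, correct))
        if correct:                     # correct answer: branch harder
            low = index + 1
        else:                           # incorrect answer: branch easier
            high = index - 1
        if low > high:                  # ability level has been bracketed
            break
        index = (low + high) // 2
    return administered
```

Because each answer roughly halves the range of plausible difficulty levels, the test-taker's ability is bracketed in far fewer questions than a fixed form would need, which is the source of the time savings the researchers describe.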
The Navy research organization started working on the cat system in 1977 at the request of the commandant of the Marine Corps. He directed the center to explore the implications of cat and to develop a cat version of the vocational-aptitude battery.
Because of the system's obvious implications for all of the armed services, the Marine Corps requested and received the support and involvement of the office of the Secretary of Defense. A joint meeting of the services in 1978 gave the Navy lead responsibility for developing the system. The Air Force is developing groups of items to be included in the test, and the Army has responsibility for procuring a "delivery system" (meaning the terminals and hardware necessary to give the tests).
The effort is guided by a coordinating committee made up of representatives from each service.
The new testing procedure, still in the developmental stage, is made possible by advances in testing theory and computer technology, according to Mr. Wiskoff. Initial trials indicate that the test could be administered via either a central computer connected to many terminals or a single microcomputer, thus making it feasible for individual schools.
Navy behavioral scientists have been working on the system for about four years, says James McBride, a psychologist who is the cat project director. The system is expected to be fully set up for the vocational-aptitude battery by 1984 or 1985, Mr. Wiskoff said.
At this point, the military is far ahead of other developers of large-scale tests, according to David Weiss, professor of psychology at the University of Minnesota. Mr. Weiss, who has been working on computerized adaptive testing for about nine years, is regarded as one of the leading authorities in the field.
The testing system has important implications for civilian testing in education, business, and industry, Mr. Weiss contends. Computerized adaptive testing promises to be superior to paper-and-pencil tests in almost every way, he believes.
"I predict that by the year 2000 computerized adaptive testing will have driven paper-and-pencil tests into the museums. We'll tell our grandchildren, 'That's the way I used to take tests."'
So far, the Navy researchers have tried out the new test method on thousands of Marines in San Diego. The Marines, according to Mr. Wiskoff, have adapted well to computers as test instruments.
The content of the test itself has not been evaluated. Nor, at this stage, have cost comparisons been made between the computerized system and paper-and-pencil tests. But Mr. Wiskoff is convinced that the computer system will be cost-effective and competitive with the paper-and-pencil batteries.
The computer system reduces the cost of administering the tests, Mr. Wiskoff says, because it eliminates expensive hand-scoring and evaluation. Even where tests are machine-scored, they must be sent away for scoring, Mr. Weiss notes. The cat system will provide results that are quicker and more accurate, since hand-scoring is subject to human error, the two researchers agree.
In addition, Mr. Weiss suggests, computerization can cut the time needed to administer tests by about half, since fewer items are needed to get an accurate picture of a person's ability or aptitude.
That factor alone, he argues, could yield substantial savings in the military, where the vocational-aptitude test now takes four hours to administer. Other mass-administered educational tests, such as the Educational Testing Service's Scholastic Aptitude Test (sat), also take several hours.
The electronic testing method, adds Mr. Wiskoff, will also reduce costs associated with redesigning test forms and replacing individual test items--a problem that now costs test-makers a significant amount of added effort and money annually.
Educational Testing Service researchers have been trying out the cat system on high-school students. In one verbal-aptitude test, ets found that accurate results could be obtained with 15 to 20 questions, compared to 50 to 80 questions on conventional tests, said Ernest Anastasio, vice president for research and development.
With the cat system, new items can be evaluated by including a few experimental items at a time in an actual test. The experimental items would not be scored, but they could be evaluated for their appropriateness, Mr. Wiskoff suggests.
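The item-tryout procedure Mr. Wiskoff suggests can also be sketched briefly. The sketch below is hypothetical (the names, the number of pilot items, and the placement rule are illustrative assumptions, not a description of the military's system): a few experimental items are mixed into an operational test, their responses are recorded for evaluation, but only the operational items count toward the score.

```python
import random

def assemble_test(operational_items, experimental_items, num_pilot=2, seed=None):
    """Build a test form as (item, scored?) pairs, seeding in a few
    unscored pilot items at random positions."""
    rng = random.Random(seed)
    pilots = rng.sample(experimental_items, num_pilot)
    form = [(item, True) for item in operational_items]
    for item in pilots:
        # pilot items are flagged unscored but administered like any other
        form.insert(rng.randrange(len(form) + 1), (item, False))
    return form

def score(form, responses):
    """Count correct answers on scored items only; pilots are ignored."""
    return sum(1 for (item, scored), correct in zip(form, responses)
               if scored and correct)
```

Because the pilot items never affect the reported score, their difficulty and appropriateness can be judged from the recorded responses without penalizing the test-taker.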
"A U.S. Civil Service Commission study estimates that cat can actually be less costly than paper-and-pencil testing overall, when one takes into account the cost associated with data collection, test development, and printing," according to the researcher.
In addition to being less costly and less error-prone, computerized tests are much more secure against theft and compromise, he argues. Since military recruiters are paid for the number of qualified recruits they attract to the armed forces, they now may be tempted to pirate the tests and coach the recruits, he says. But because no versions of the computerized test need be stored, it would be difficult to steal the tests.
Moreover, because the versions of the computerized test now being developed offer such a large number of items within the system, and because each individual is tested on a unique combination of these items, coaching would be very difficult. Each test-taker and proctor will be exposed to only a limited number of the total number of questions that could turn up, Mr. Wiskoff notes.
The virtues of the system that appeal to the military make it equally sensible for a civilian setting, according to Mr. Weiss. For example, he says, under the branching system of harder and easier questions, the sat's could be refined to offer more precise measures of all levels of ability, rather than just that of the "typical" college applicant. Only a few items in current tests are geared to the superior student, he points out.
Mr. Weiss estimates that the number of sat questions could be reduced by two-thirds and still yield better measurements of academic ability with the use of the cat system.
ets's Mr. Anastasio agrees with Mr. Weiss that computerized adaptive tests can make mass testing more efficient and more accurate. And he foresees important applications in the course-selection and placement areas, especially at the community-college level.
Community-college students often "walk in off the streets with little or no evidence of their aptitude and abilities," Mr. Anastasio says. The colleges need ways to place such students at appropriate levels of mathematical and verbal ability, and a well-designed computerized test could give that measure of ability in 40 or so minutes, he believes.
ets is developing such tests, which it hopes to start franchising to community colleges in New Jersey next year, according to the official.
Beyond the community-college effort, Mr. Anastasio adds, ets will continue to explore the implications of the technique. "Lots of questions have to be answered," he feels, before the cat approach can be adapted for use in the schools or for national tests like the sat series.
But eventually, Mr. Weiss and other researchers predict, computerized adaptive tests will be used to gauge student achievement in the classroom. Mr. Weiss is developing such a test for his introductory-psychology course at the University of Minnesota.
He believes the cat system eventually could do away with multiple-choice questions, thus eliminating the guessing from tests. Computers can handle a certain amount of text, he explains, so "free response" types of questions could be asked.
With computers, he argues, testers can devise more varied and elaborate measurements than are now possible in standardized-test formats. "We can hook a microcomputer to a videodisc," a disc holding 55,000 frames of information, to create tests involving "movement and perceptual kinds of things. People could be tested on their reaction to situations or on their social intelligence."
Army behavioral scientists actually have developed a system that links computer and videodisc technology for leadership training. Officer trainees at Fort Benning, Ga., are presented with interpersonal problems on a video system along with three to five possible responses, says Frederick Dyer, one of the developers of the system.
The officer chooses one response and the computer selects another video sequence showing a noncommissioned officer or private reacting to the officer's response.
"It's a dramatic thing," Mr. Dyer says. "People definitely find this system stimulating, challenging, and entertaining."
Mr. Weiss contends that "testing will be quite different a few years from now. Tests will look more like games in a few years and taking tests will be a lot more fun."