With students' academic futures, educators' careers, funding for schools, and efforts to increase accountability in public education on the line, guarding the security of state tests has become more important than ever.
Ellicott City, Md.
Which of the following is a violation of test security?
(A) Examiner (to student using calculator): "You need to punch in the other number first!"
(B) Examiner to student dictating a response: "Don't you want to say anything else? Why did the ___ look like that?"
(C) When reading directions to the students, the examiner substitutes a few words for what is written in the directions because she never used those words in class.
(D) All of the above.
The correct answer is D.
Although some of these scenarios produced a few guffaws among teachers at a recent Maryland workshop on testing, test security these days is no laughing matter.
Increasingly, states use the results from standardized tests to reward and punish students, educators, and schools. The scores can help determine whether students graduate, teachers and principals receive salary bonuses, or schools get shut down.
But as the stakes increase, so do concerns about security. In the past few years, Kentucky, Rhode Island, and Texas have had to invalidate scores for individual students or schools--or, in one case, suspend an entire test administration--because of suspected cheating or breaches in security.
"It's an area of great, growing concern," says Wayne Martin, the director of the state education assessment center at the Council of Chief State School Officers. "Most of the states are looking very carefully at what exactly are the security issues and what can they do to reinforce them."
In May, for example, the Texas legislature passed a bill that makes it a felony to alter tests or test results. Those found guilty could spend up to 10 years in prison and pay up to $10,000 in fines. States such as Kentucky and Rhode Island also have tightened security and revised their codes of ethical testing practices.
"Clearly, the problem of cheating on tests is becoming a much larger problem than it's been in the past," says Walter Haney, a professor of education at Boston College and a senior research associate at the Center for the Study of Testing, Evaluation, and Educational Policy. "In the past, almost all cheating on tests was reported to be done by students," he says. "Now, there are increasing numbers of examples of cheating on tests not by the students, but by teachers and administrators because of the stakes associated with the test results."
To examine what states are doing to ensure the integrity of their testing systems, Education Week followed the spring 1999 administration of one state test, the Maryland School Performance Assessment Program. Although Maryland's procedures are not mirrored in every state, they reflect a general trend.
Since 1993, Maryland has used scores on the MSPAP, pronounced "miss-pap," to help judge the performance of individual schools. Results help identify low-performing schools that are eligible for assistance and, in the worst cases, a complete overhaul known as reconstitution. Schools that improve significantly also can earn monetary rewards.
The results are not used to evaluate individual students.
Children in grades 3, 5, and 8 take the exams in reading, writing, language usage, mathematics, science, and social studies over a period of five days in May for each grade level. Unlike many state tests--which rely primarily on multiple-choice items--Maryland asks students to work in small groups or pairs to conduct experiments, collect data, and apply their knowledge from multiple disciplines. Students then answer questions and write extended responses individually.
CTB/McGraw-Hill, a commercial test publisher based in Monterey, Calif., edits and publishes MSPAP test items and leads the analysis of test data. The actual questions are written by Maryland teachers, who also score the exams.
In April, about three weeks before administration, CTB/McGraw-Hill ships boxes of materials from a printing plant in Texas to individual schools and district offices by truck. Shrink-wrapped inside each box are resource books for students, survey documents, and answer books. There are also highly scripted examiner's manuals for administering the tests, including what to say and do and how much time to allot for each task. Every resource book, answer book, and examiner's manual has its own serial number and bar code.
Because MSPAP is a performance assessment, the exams also come with boxes of tools and "manipulatives," from plant seeds and other "live materials" to test tubes and measuring instruments.
Maryland requires each of its 24 school systems to assign a local accountability coordinator whose job is to oversee test administration and security. Each school also has a test coordinator--typically the assistant principal, a reading specialist, or a school counselor. One of the coordinator's primary tasks is to unpack and inventory the testing materials as they arrive.
At Longfellow Elementary School, a cheerful, red-brick building on a quiet street here in mostly suburban Howard County, about 25 miles west of Baltimore, the job falls to Assistant Principal Pamela Cullings, who keeps the boxes locked in her office. The key dangles from a red ribbon around her neck.
One afternoon in mid-April, when she's ready to do the inventory, she lugs the boxes into a room down the hall marked with a large sign: "Test materials. Please keep door locked!"
There, she matches the sequence numbers listed on the packing list with the bar codes on each booklet, signing her initials as she goes. If there's any discrepancy between the secure materials received and the numbers that appear on the packing list, Cullings will flag it on a special form to notify both the test publisher and the district.
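At its core, that reconciliation is a set comparison: the serial numbers actually scanned in must match the packing list exactly, and anything missing or extra gets flagged. A minimal sketch of the idea (the function name, serial-number format, and sample data are invented for illustration; this is not Maryland's or CTB's actual system):

```python
# Sketch of a packing-list reconciliation check: compare serial numbers
# actually received against those the publisher says were shipped.
# Names and sample data are hypothetical, for illustration only.

def reconcile(packing_list, received):
    shipped = set(packing_list)
    on_hand = set(received)
    return {
        "missing": sorted(shipped - on_hand),    # on the list, never arrived
        "unexpected": sorted(on_hand - shipped), # arrived, not on the list
    }

if __name__ == "__main__":
    packing_list = ["MD-1001", "MD-1002", "MD-1003", "MD-1004"]
    received = ["MD-1001", "MD-1002", "MD-1004", "MD-9999"]
    print(reconcile(packing_list, received))
    # → {'missing': ['MD-1003'], 'unexpected': ['MD-9999']}
```

Either non-empty list in the result would correspond to the discrepancy form Cullings files with the publisher and the district.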
Because MSPAP asks students to participate in lengthy and complex performance tasks, many of which stretch over several days, no student takes the entire exam. Instead, three different "forms" or portions of the test are given in each grade to different, randomly assigned groups of students. A large school may have several groups of youngsters taking each form of the test.
For Cullings, whose 430-student, K-5 school has four testing groups per grade, that means there are multiple versions of answer books, resource books, examiner's manuals, and manipulatives to keep straight. There are also large-print and Braille versions of the exams.
For each testing group, Cullings fills out a security checklist that includes the number of materials received along with their serial numbers.
Next, she pulls out a large box of tools and manipulatives. For each examiner, she'll prepare a little brown paper bag for each testing day that includes the materials needed for that morning, such as measuring cups, containers, trays, and rubber bands. She scribbles notes on the bags about any additional materials the examiner must supply, such as calculators or index cards.
"It's like baking a cake," she says, as she sorts the boxes of materials into slowly accumulating stacks and bags. Make that a layer cake, with separate fillings for each layer, and you get the idea.
In January, each Maryland district also sent CTB/McGraw-Hill a computer tape with the name and birthdate of every student who was scheduled to take the exam. CTB assigns each student a unique identification number and prints out three labels. Before exam time, Cullings or an assistant will attach one label to a survey document to be completed by the student, which provides background information about the instructional practices in the school.
A second label goes on the student's answer book. The third label is kept at the school as a record of who is tested.
At the beginning of each testing day, the teachers pick up their materials, count them, and initial a daily log to verify that they've been received. At the end of each day, they return the materials to Cullings, who again counts the materials, rechecks the sequence numbers, initials the daily log, and locks the log and supplies in her office.
After all testing is completed, Cullings will inventory the materials one last time before boxing them up and shipping them back to the publisher. "I do a lot of counting," she sighs.
"Nine times out of 10, it's going to be right," she says about the inventory. "But if it's not, you're looking for something that you never got."
Such painstaking inventories are a common way of preventing or catching cheating. In April, officials at two Dade County, Fla., high schools uncovered a case of suspected cheating after a daily count of test booklets revealed that one book was missing. Though the missing document later turned up, suspicions were raised when some students changed their previous answers on a math portion of the exam that had not yet been graded.
About two weeks before the MSPAP exams, Cullings holds a test-orientation session for all of the examiners and proctors in her school, including the aides who will work with students with disabilities.
She reviews the procedures for checking out and returning test materials; reminds teachers to keep track of which students are absent on testing days; and walks teachers through the test-security exercise, which includes 16 possible scenarios that could occur during MSPAP administration.
She also reviews some of the rules on accommodations for special-needs students. For example, if a student dictates his or her responses, the answers must be recorded on tape as well as transcribed, to ensure that the examiner has not added or altered any words.
"In their heart of hearts, most teachers know what's right and wrong," says Leslie Wilson, the supervisor of testing for the 41,000-student Howard County schools. "Where the problem often comes in is when people are providing the accommodations to special education students, and the kid starts to stumble or struggle. That's when you have the tendency to want to help."
At Longfellow Elementary's orientation session in late April, a few first-year teachers ask questions, but most seem familiar with the procedures. David Casey Hickenbottom, a 3rd grade teacher, says he's gotten used to the constraints he is under during testing time. "You just can't say anything," he says. "It's the one time of the year when all I can say is, 'I'd love to help you, but I can't.' "
After the session, the participants must sign a form verifying that they've been trained. Howard County adopted that procedure a year ago, after a few teachers who committed testing violations claimed they had never been told the practices were inappropriate. Starting next year, the state plans to replicate the practice statewide.
The Friday before exam week, teachers all over Maryland scour their classrooms, removing materials that could inadvertently aid children during testing. Teachers are allowed to leave up guidelines for writing, conducting scientific experiments, or other tasks, as long as they have been in use all year. In Longfellow's bustling, well-organized classrooms, nearly every surface is covered with such teaching aids.
On D-Day, Cullings is on the phone early, trying to persuade a parent whose child has allergies to send the student to school just for the testing period. Otherwise, the child will receive a score of zero for the day, which will count against the school's overall performance. In general, says Wilson, "if you're going for a diploma, if you're taking the regular curriculum, you take the test."
Before heading out into the hallways, Cullings stops by the photocopier to grab a stack of neon-yellow signs that she'll post outside each testing area: "Stop, students, teachers, and parents. Please do not enter the testing area at this time. Report to the office first."
During the tests, the students' desks must be cleared.
In each room, a proctor--in addition to the testing examiner--walks up and down the aisles. Except during the preassessment activities, such as setting up an experiment, the teachers' comments to students are minimal, confined to such generalities as "try your best," or "read it carefully."
Ellen Murphy reminds her 5th graders: "You're encouraged to highlight in the resource book and in your answer book. You need to make sure you put effort in your test completion. That means answer every question. If you don't understand something, the only thing I can tell you is to go back and reread the activity."
At the end of the day, teachers hand out stars and candy for good test-taking behavior. "I was an examiner last year, and by day five, you're just drained mentally," says Stephanie Ketner, a special education resource teacher who is serving as a proctor this year. "The kids are at their wits' end also. They've put all that effort into it. So by the fifth day, it's been a long road."
On the last day, all the testing materials are inventoried again, packed up in separate envelopes, and boxed. The trucks swing back into town. And the boxes head cross-country to CTB/McGraw-Hill's headquarters in Monterey.
The company's main building, perched on a hilltop overlooking the Monterey Bay, is the length of two football fields. Security guards and cameras monitor the exits, and visitors cannot pass anywhere unescorted.
CTB/McGraw-Hill is the testing division of a publishing giant, the McGraw-Hill Companies Inc. In addition to publishing state assessments for 22 states, including New York, Colorado, and Indiana, the company produces well-known off-the-shelf tests, including the California Achievement Tests, the Comprehensive Tests of Basic Skills, and the 2-year-old TerraNova program.
In the rooms where tests are scored by hand, readers must sign a confidentiality statement. Their handbags and backpacks are stored in lockers outside, and random bag searches are done as readers enter and exit the building. At least twice a year, the assistant director of corporate security, a big, burly ex-detective from New York City, drops in unannounced to conduct spot checks to ensure that procedures are being followed.
When huge stacks of shrink-wrapped boxes arrive from the company's warehouse in Salinas, every box is assigned a tracking number.
Then, the boxes are unpacked and the envelopes are put on wheeled carts that are sent to the next room to be inventoried.
There, workers scan the bar codes for the returned materials into a computer database. Documents, such as resource books, that are not scored are reboxed and sent to the warehouse for storage. The survey documents, with the students' biographical information, move on to a room where large machines scan the information into the computer database at 9,000 pages an hour. The information is stored on magnetic tapes, which are loaded onto the carts with the survey books.
In the next room, data processors review the information. For example, the student identification numbers must match those sent by the state, and all of the bar codes must be valid. The data from the survey documents will be used to produce score sheets for Maryland teachers to fill in when they hand-score the answer books later this summer. The score sheets will allow the scores to be included in the statistical analyses the state will use to prepare its reports.
Employees insert the score sheets into the answer books and cover each student's name and school with an opaque label to prevent possible bias when Maryland teachers score the exams. The books are then "randomized" so that readers will be scoring tests from multiple districts and schools. The materials are checked once again, then boxed and shipped to Measurement Inc., a Durham, N.C., company that helps Maryland administer the scoring process.
Once teachers score the answer books by hand, employees at Measurement Inc. will scan the results into the computer, edit the data, send the information back to CTB electronically, and ship the answer books back for storage. CTB stores the materials for one year before they are destroyed.
"You try to build in a lot of checks up front to prevent problems from occurring," says Doug Hartman, CTB's senior director of scoring services. "And then, you try to build in contingency plans for when problems do occur."
Despite the tight security surrounding many state tests, problems do arise. Given the thousands upon thousands of students tested each year, the actual number of such cases is small. But they are the bane of state assessment programs.
- In January, Texas required 11 of the state's 1,042 districts to investigate possible cheating or test tampering after a computer analysis spotted 33 campuses with a higher-than-average number of erasures on student answer booklets from 1996 to 1998. Although five districts found no wrongdoing, a handful of principals and teachers in the other districts have been suspended or asked to resign.
In April, a Travis County grand jury indicted the Austin district and a deputy superintendent on misdemeanor charges for allegedly tampering with test data. Another district employee pleaded no contest to charges of altering government documents.
- In March, Rhode Island hastily halted its statewide English and math tests when officials learned that some teachers in 21 districts had kept copies of last year's exams and used them to prepare students for this year's tests, which contained the same questions. Teachers said they did not know the tests would be reused. The state resumed testing, with an alternative form of the exams, in May.
- Massachusetts officials are looking into more than two dozen reports of possible test violations, following the spring administration of the Massachusetts Comprehensive Assessment System. State Commissioner of Education David P. Driscoll said that, next year, changes will be made in the testing schedule to minimize security breaches.
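The Texas erasure screen described above is, in essence, outlier detection: flag any campus whose erasure count sits far above the statewide average. A hedged sketch of that kind of check (the two-standard-deviation cutoff and the data are assumptions for illustration; Texas's actual analysis is not described in detail here):

```python
# Sketch of an erasure-rate screen: flag campuses whose average number of
# answer-sheet erasures is unusually far above the overall mean.
# The data and the 2-standard-deviation threshold are illustrative assumptions.
from statistics import mean, stdev

def flag_outliers(erasures_by_campus, z_cutoff=2.0):
    """Return campuses whose erasure counts exceed the mean by z_cutoff stdevs."""
    values = list(erasures_by_campus.values())
    mu, sigma = mean(values), stdev(values)
    return [campus for campus, count in erasures_by_campus.items()
            if sigma > 0 and (count - mu) / sigma > z_cutoff]

if __name__ == "__main__":
    erasures = {"Campus A": 3.1, "Campus B": 2.8, "Campus C": 3.0,
                "Campus D": 12.5, "Campus E": 2.9, "Campus F": 3.2}
    print(flag_outliers(erasures))  # → ['Campus D']
```

A flag from a screen like this is only a trigger for investigation, not proof of tampering; as the Texas case shows, some flagged districts found no wrongdoing.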
Officials in many states say incidents of cheating or problems with test security are comparatively rare. "Most of the time, we're finding that people are just trying their doggone hardest to do what's right and get through it the best they can," says Robyn Oatley, the director of community relations for the Kentucky Department of Education. "The majority of the allegations that we get are simple human error."
But no good national data are available on the frequency of outright cheating or questionable testing practices. And at least some experts say problems with test security are more widespread than many people want to believe.
"Is the whole basket of questionable practices--including those in the big gray area that is not frank cheating--substantial enough that it's likely to distort gains in high-stakes testing?" asks Daniel Koretz, a professor of education at Boston College. "The answer is yes. If I had to put money on it, I would bet, based on the scarce data we have, that the gains will be markedly inflated."
The high stakes now associated with state tests are a factor, says James L. Parsons, the coordinator of assessment and evaluation for the 25,000-student district in Humble, Texas, a Houston suburb. "Something has happened around Texas, and probably around the country, when you have superintendents' salaries and bonus incentives" based on passing rates on state tests.
In Maryland, a Test Security Council--which includes the state education department's legal counsel, an assistant state superintendent, and the director of test security--meets once a month to review possible violations.
Sanctions for educators can range from a letter of warning or a reprimand to a suspension without pay, or even dismissal. The state can also suspend or revoke a teaching certificate and impose monetary penalties on a school system if, for example, the violation requires the state to spend money developing new assessment tasks.
The state also monitors the testing process. Since 1995, it has audited student answer books in cases where scores look suspiciously high.
For an audit, the answer books are sent back from California, and officials in the education department look for identical or near-identical responses. For the 1997-98 test administration, the state audited about 2,500 books out of about 150,000 total.
The last time state officials actually found a problem was in 1995. "People are pretty well aware of the audit throughout the state," says Kathleen Rosenberger, the director of test security for the Maryland education department. She argues that it acts as a deterrent.
The state also conducts a yearly audit of any schools in which 40 percent or more of the students receiving special education services are slated to be exempted from the exams. Last year, the state audited fewer than 30 such schools.
This year, Maryland also began auditing tests of students who receive special accommodations during testing, such as having written materials read to them. "What we're looking for in those cases is that the child is receiving those accommodations on a daily basis during the school year, and not just for the test," Rosenberger says.
During MSPAP administration, the state also sends out teams to schools that have had infractions in the past and to other schools on a random basis to observe test administration.
Since 1992, the state has investigated 173 possible violations of testing practices. Of those, 67 percent--or 115--were found to be true violations. The most common problems--about 37 percent of the cases--involved "improper assistance," such as children copying from each other or receiving help from a teacher.
About 10 percent involved missing test booklets. And 5 percent involved books or other testing materials that were not returned on schedule. Other violations include test materials that weren't stored properly, test items that were divulged publicly, or disorderly test administration. When violations are found, scores are invalidated.
"We have had situations where there have been suspensions without pay from two days to one and a half years," Rosenberger says. "We've had some cases where the [teacher's] certificate has been suspended, which means you cannot work." But those instances are rare, she adds. "In my judgment, out of the 3.85 million tests that we've given since January 1992 through today, 173 is a really small number."
Other states are clamping down as well. In June 1997, Kentucky's education commissioner created a division in the education department to handle allegations of testing violations, after a series of articles in the Lexington Herald-Leader suggested that the state had been lax in investigating potential problems. This spring, in preparation for a new statewide test, Kentucky revised its code of ethical testing practices.
In South Carolina, violations of test security are a misdemeanor, punishable by up to a $1,000 fine and 90 days in jail. Any investigations are handled by the state law-enforcement division.
"My own view is that this problem is one of many symptoms of
policymakers' trying to load too many functions onto one testing
Walter Haney, Center for the Study of Testing, Evaluation, and Educational Policy.
Vol. 18, Issue 42, Pages 29-33