'De-Identifying' Student Data Is Key for Protecting Privacy
School districts and the companies that work with them need to put a greater emphasis on ensuring that the massive amounts of student data collected every year are cleared of identifying information before records are shared or released to outside organizations.
While many districts and vendors are contemplating how to keep student information confidential, the “de-identification” of sensitive student data is not always technically simple, and a lack of resources means some districts are ill-equipped for the task, said Jules Polonetsky, the executive director of the Future of Privacy Forum.
“De-identification isn’t easy,” Polonetsky said. “It’s really hard when you’re dealing with an open system like schools, where parents, volunteers, and teachers need access to student data for collaboration, but at the same time we want the best standards of privacy control.”
To shed light on the legal and technical gray area involving the sharing and use of student data, the forum released a paper this month on the de-identification of sensitive student information.
Ideally, that process involves purging student records of any information directly linked to an individual student, as well as removing or obscuring any indirect information that could allow others to figure out who a student is, before the records are shared with a third party.
But that is easier said than done in an era in which students generate significant amounts of digital information, data brokers use public and private sources to amass extensive profiles of individuals, and researchers and vendors use increasingly advanced statistical and mathematical techniques in the course of everyday business.
Room for Interpretation
Personally identifiable information generally includes a student’s name and address, the names of family members, and such individual identifiers as a Social Security number or student ID. Examples of indirect identifiers might include a student’s date and place of birth, race, religion, weight, financial information, and mother’s maiden name.
“De-identification” of student data is the process of purging or altering identifiable information from student records before it is shared with organizations outside the school district.
Three key techniques for de-identifying student data are:
• Blurring: Reducing the precision of disclosed data to minimize individual identification. This may involve grouping data into broader categories so unique cases are not highlighted. For example, student information might be placed in a “minority” category rather than individual categories, such as African-American or Hispanic.
• Perturbation: Making small changes to data to prevent individual identification from unique or rare population groups. Some data can be swapped among individuals for analyses in which certain factors don’t matter. For instance, if birthplace is irrelevant, one student’s birthplace can be switched with another’s.
• Suppression: Removing some forms of data, such as race or place of birth, altogether to prevent individual identification, particularly within small groups.
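The three techniques above can be illustrated with a minimal Python sketch. It is not drawn from the forum's paper: the records, field names, and the helper functions `blur`, `perturb`, and `suppress` are hypothetical, and real de-identification would also require a re-identification risk review that this example does not attempt.

```python
import random

# Hypothetical records; field names and values are illustrative only.
students = [
    {"id": 1, "race": "Hispanic", "birthplace": "Austin", "score": 88},
    {"id": 2, "race": "White", "birthplace": "Boston", "score": 75},
    {"id": 3, "race": "African-American", "birthplace": "Chicago", "score": 91},
    {"id": 4, "race": "White", "birthplace": "Denver", "score": 67},
]

def blur(records):
    """Blurring: collapse specific race values into a broader category."""
    broad = {"African-American": "minority", "Hispanic": "minority"}
    return [{**r, "race": broad.get(r["race"], r["race"])} for r in records]

def perturb(records, field, seed=0):
    """Perturbation: swap one field's values among records by shuffling them."""
    values = [r[field] for r in records]
    random.Random(seed).shuffle(values)
    return [{**r, field: v} for r, v in zip(records, values)]

def suppress(records, fields):
    """Suppression: drop the named fields from every record."""
    return [{k: v for k, v in r.items() if k not in fields} for r in records]

blurred = blur(students)                      # race reported in broader groups
swapped = perturb(students, "birthplace")     # birthplaces reassigned among students
released = suppress(blurred, {"id", "birthplace"})  # direct and indirect identifiers removed
```

Each function returns new records and leaves the originals untouched, so a district could keep the full data internally and release only the transformed copy. Swapping is appropriate only when, as in the birthplace example, the swapped field does not matter to the analysis.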
At the federal level, the handling of such information is governed primarily by the federal Family Educational Rights and Privacy Act. FERPA, as the law is commonly known, “prohibits the disclosure of education records containing personally identifiable student data without parent or eligible student consent,” according to the forum’s paper.
But that leaves open considerable room for interpretation—and disagreement.
The forum, which is closely aligned with industry and is the prime mover behind a voluntary pledge on protection of student-data privacy that more than 160 companies have signed, believes that “appropriately de-identified” student information is not covered by FERPA.
“Properly de-identified student data thus may be shared without limitation under FERPA (although other federal and state privacy laws may apply),” the group writes. “Furthermore, ‘de-identified’ information from education records is not subject to any destruction requirements because, by definition, it is not ‘personally identifiable information.’ ”
But Fordham University law professor and privacy expert Joel Reidenberg, also an academic adviser to the forum, pointed out what he views as problems with that approach and the ways in which it highlights “holes in FERPA’s scope of coverage.”
Technical and statistical advances have made it “easier and easier” to take information that has ostensibly been “de-identified” and link it back to individual students, he said.
In addition, Reidenberg said, “customized profiles” of individual students constructed on the basis of their interactions with technology may not include a student’s name or address, but are still being used to make critical decisions about what and how students are taught.
Most troubling, he maintained, is that neither parents nor advocacy groups appear to have standing to challenge the ways in which FERPA is being interpreted, applied, and regulated by the U.S. Department of Education. When the Electronic Privacy Information Center tried to sue the department over controversial regulations it issued in 2011, the courts dismissed the case, saying the organization lacked legal standing.
Some of that could be poised to change. In recent months, bills have been introduced in Congress that would either rewrite FERPA or create entirely new federal privacy laws aimed at protecting student information.
Setting District Standards
In the meantime, school districts are left to grapple with the issue.
Bob Moore, an education technology consultant who leads privacy initiatives for the Consortium for School Networking, said districts vary widely in their understanding of, and action on, de-identification of student data and other privacy issues.
“Some are more sophisticated,” he said. “Others aren’t sure what their role is in doing this or how important it is.”
The 52,500-student Howard County, Md., district has taken the issue seriously. It created the position of coordinator of data privacy, and it strictly controls how information is released.
Those controls, said Teddy Hartman, the data-privacy coordinator, include suppressing data on groups containing fewer than 10 students; encouraging some vendors to use school or classroom accounts instead of individual student accounts when possible; minimizing the amount of student data vendors have access to in order to make their digital tools work; and making vendors sign data-privacy contracts that are stricter than the forum’s privacy-protection pledge.
The district uses FERPA as the “foundation, not the ceiling,” Hartman said. “It’s a baseline. We certainly strive to be more protective around data.”
In addition, he said, any new digital tools—including programs, apps, and software being considered for use—must go through a stringent process that includes privacy concerns from the start.
“Privacy has to be thought of upfront as you’re designing programs, not as an afterthought,” he said.
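The small-group rule Hartman describes, suppressing data on groups of fewer than 10 students, can be sketched in a few lines of Python. This is an illustration only, not the district's actual system: the function name `group_counts_with_suppression`, the `grade` field, and the choice to report a suppressed count as `None` are all assumptions.

```python
from collections import Counter

MIN_GROUP_SIZE = 10  # threshold cited for Howard County; other districts may differ

def group_counts_with_suppression(records, key, min_size=MIN_GROUP_SIZE):
    """Count records per group, replacing counts below min_size with None."""
    counts = Counter(r[key] for r in records)
    return {group: (n if n >= min_size else None) for group, n in counts.items()}

# Hypothetical data: 12 students in grade 9, 4 students in grade 10.
records = [{"grade": 9}] * 12 + [{"grade": 10}] * 4
report = group_counts_with_suppression(records, "grade")
# report == {9: 12, 10: None}
```

A group of exactly 10 is still reported, because the rule applies only to groups with fewer than 10 students.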
Elsewhere, districts are struggling with how to keep student data safe.
A detailed Louisiana privacy law that went into effect during the last school year bars schools from collecting more than two pieces of personally identifiable information about students without parental consent, among other requirements. Despite guidance from state education officials, districts there have struggled to comply with the law.
But Sheryl Abshire, the chief technology officer for the 32,600-student Calcasieu Parish schools in Lake Charles, La., said the new law is pushing districts to confront the issues.
Among other strategies, Abshire said, she had to make every vendor working with her district sign a contract addendum with extensive requirements on student-data privacy. She said most vendors she’s worked with seem fairly sophisticated about de-identifying student data and addressing other privacy issues.
Because of the state law, Louisiana districts may be ahead of others in the country, Abshire said.
But she also said that fears about possible misuse of student data shouldn’t prevent districts from using technology extensively to improve teaching and learning.
“We must be responsible around data, but also responsible around student learning,” she said. “We shortchange students and our community if we step back and say, ‘This is too complicated, so we’re not going to do it.’ ”
Vol. 35, Issue 02, Pages 12-13
Published in Print: August 26, 2015, as Schools Urged to Put a Higher Priority On 'De-Identifying' Student Data