The Nation’s Report Card Could Be Education’s Data Gold Mine
Opinion

Better support for educators, higher student achievement, and improved tests are among the possible outcomes
By Mark Schneider & John Whitmer, May 18, 2023
Mark Schneider is the director of the Institute of Education Sciences, the U.S. Department of Education’s primary research arm. John Whitmer is a senior fellow at the Institute of Education Sciences.

ChatGPT feels like it’s everything, everywhere, all at once (repurposing a great movie title but inserting punctuation). How generative artificial intelligence—AI that creates new text or images (as you can see in ChatGPT, Bing, or DALL-E)—shakes out is unclear: Will we create an artificial superintelligence that displaces humans? Or will we harness its power to improve learning processes and outcomes?

Nobody can predict that future with certainty, but one thing we do know is that generative AI requires large quantities of high-quality, relevant data to be of any value. In the education sciences, we also know that such large-scale, high-quality data are neither everywhere nor all at once. However, the National Assessment of Educational Progress, often known as the Nation’s Report Card, provides carefully collected, valid, and reliable data with rich contextual information about learners while protecting student privacy. In short, NAEP can begin to fulfill the data needs of modern education research. And the National Assessment Governing Board—which sets policy for NAEP and meets this week—should prioritize the release of these data.

As is so often the case, the science is moving faster than the speed of government, but this is one area where we have everything we need to catch up. Given the potential these taxpayer-funded data have to improve support for educators and outcomes for students, there is a clear obligation to make the information available to researchers. As advocates for high-quality, high-impact research, we urge that step.

Since 1969, NAEP has measured student achievement in mathematics, reading, science, writing, arts, history, and civics. NAEP uses a mix of conventional forced-choice items; student essays; short, open-ended responses; and simulations. NAEP also collects “process data” about how students interact with items using the digital-based assessment platform. Further, NAEP collects detailed demographic and self-reported information, which includes the basics (for example, race/ethnicity, gender) and deeper information (for example, English-language-learner status, IEP status, disability accommodations). NAEP’s data mine holds hundreds of thousands of examples of student work coupled with detailed contextual information about students, their school, and their community. We need to use those data to improve AI algorithms that can in turn improve student outcomes.

Automated scoring is among the most widely researched and deployed uses of AI in education. But replicating human scoring is the floor, not the ceiling. Researchers could use NAEP data to explore complex constructs that have more far-reaching implications than scoring—such as categorizing math misconceptions, identifying ways to improve student writing, or understanding the key themes present in student writings about civic engagement.

With NAEP’s large samples and detailed contextual variables about the test-takers, their schools, and their families, we can also learn about the impact of many factors on student achievement.

Protecting student privacy is, of course, essential, but it is not a reason to delay the release of the data, as some argue. Many safeguards are already in place. Because NAEP’s results are reported at the group level, protecting privacy is easier than it is for individual assessments: every result is a summary across many individuals. Further, NAEP’s long history and its procedures minimize risk. For example, the information that could identify a particular test-taker is removed even before the data leave the school. There are known solutions to ensure that individual student identities will not be revealed when only a small number of students fall into a given subgroup. Open-ended responses are a bit trickier; NAEP doesn’t control what students put into these fields, and students sometimes write off-topic, revealing personal data that need to be scrubbed (for example, “My uncle, Frank Johnson, who lives in Auburn, was once busted for DUI”).
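
One common safeguard for small subgroups is small-cell suppression: a group-level statistic is withheld whenever too few students contribute to it. The Python sketch below illustrates the general idea only and is not NAEP's actual procedure; the function name, the threshold of 10, and the sample records are all assumptions made for illustration.

```python
from collections import defaultdict


def suppressed_group_means(records, group_key, value_key, min_cell_size=10):
    """Return the mean of value_key per group, or None where the group is too small.

    min_cell_size is a hypothetical threshold, not an official NAEP rule.
    """
    groups = defaultdict(list)
    for record in records:
        groups[record[group_key]].append(record[value_key])

    results = {}
    for group, values in groups.items():
        if len(values) < min_cell_size:
            results[group] = None  # suppressed: too few students to report safely
        else:
            results[group] = sum(values) / len(values)
    return results


# Made-up records for illustration only: 12 students in subgroup A, 3 in subgroup B.
records = (
    [{"subgroup": "A", "score": 250 + i} for i in range(12)]
    + [{"subgroup": "B", "score": 270} for _ in range(3)]
)
report = suppressed_group_means(records, "subgroup", "score")
print(report)
```

Here subgroup A is reported because it has 12 students, while subgroup B, with only 3, is withheld. Real disclosure-avoidance procedures also apply complementary suppression so that a hidden cell cannot be recovered by subtracting from published totals; this sketch omits that step.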

The Institute of Education Sciences, where we work, is scrupulously addressing privacy concerns in NAEP data. Our recently announced competition (with $100,000 in prizes) asks researchers to solve the difficult problem of using AI to replicate human-assigned scores for open-ended math items. Before NAEP math-assessment data were released to participants, the information was scrubbed for personally identifiable information and sensitive language using automated and human-based reviews. The reviews ensured that neither student identities nor other types of sensitive information, such as a social media handle, were disclosed. The dataset is being further processed through our internal controls to ensure it is sufficiently safe to release.

Decisions regarding data privacy should weigh the relative risk against the reward. The value of tapping NAEP’s data gold mine is high, and, given its history and design, the risk to student privacy is low. In short, privacy concerns should not inhibit the release of NAEP data to qualified researchers.

Research using NAEP data could improve NAEP itself but, more importantly, answer questions about how students learn. For NAEP as an assessment, modern research methods could be used to help review and revise the questions, identifying items that specific groups of students find difficult due to wording or issues not related to the underlying construct. This would move beyond standard psychometric analyses through the incorporation of rich contextual data.

NAEP data could have much broader applicability, especially in the context of large language models, the underlying approach used by generative AI. Most existing large language models are based on data scraped from all over the web. While OpenAI, the company that created ChatGPT, does not disclose the specific data sources used for model training, ChatGPT is reportedly trained using information from web texts, books, news articles, social media posts, code snippets, and more. There are more than a few examples of ChatGPT providing questionable or toxic responses depending on the prompt it is given. An equally serious (and related) problem is that large language models do not have access to enough student academic work, leaving them severely anemic just where we need them most. NAEP data could help with fine-tuning these models, making them more accurate and more useful.

We are only beginning to see how the future of education research will be transformed by generative AI—but one thing is crystal clear: NAEP data must be part of that future. Opening up NAEP’s gold mine of data is an easy call. Doing so will allow us to tap into the creativity of the research community to explore what insights we can derive from NAEP data that will be useful to education stakeholders.

NAEP is approaching a $200 million-a-year operation. While it produces invaluable insights into student achievement, it has not yet delivered on its full promise.
