Opinion

The Nation’s Report Card Could Be Education’s Data Gold Mine

Better support for educators, higher student achievement, and improved tests are among the potential outcomes.
By Mark Schneider & John Whitmer — May 18, 2023

ChatGPT feels like it’s everything, everywhere, all at once (repurposing a great movie title but inserting punctuation). How generative artificial intelligence—AI that creates new text or images (as you can see in ChatGPT, Bing, or DALL-E)—shakes out is unclear: Will we create an artificial superintelligence that displaces humans? Or will we harness its power to improve learning processes and outcomes?

Nobody can predict that future with certainty, but one thing we do know is that generative AI requires large quantities of high-quality, relevant data to be of any value. In the education sciences, we also know that such large-scale, high-quality data are neither everywhere nor all at once. However, the National Assessment of Educational Progress, often known as the Nation’s Report Card, provides carefully collected, valid, and reliable data with rich contextual information about learners while protecting student privacy. In short, NAEP can begin to fulfill the data needs of modern education research. And the National Assessment Governing Board—which sets policy for NAEP and meets this week—should prioritize the release of these data.

As is so often the case, the science is moving faster than the speed of government, but this is one area where we have everything we need to catch up. Given the potential these taxpayer-funded data have to improve support for educators and outcomes for students, there is a clear obligation to make the information available to researchers. As advocates for high-quality, high-impact research, we urge that step.

Since 1969, NAEP has measured student achievement in mathematics, reading, science, writing, arts, history, and civics. NAEP uses a mix of conventional forced-choice items; student essays; short, open-ended responses; and simulations. NAEP also collects “process data” about how students interact with items on the digitally based assessment platform. Further, NAEP collects detailed demographic and self-reported information, which includes the basics (for example, race/ethnicity, gender) and deeper information (for example, English-language-learner status, IEP status, disability accommodations). NAEP’s data mine holds hundreds of thousands of examples of student work coupled with detailed contextual information about students, their school, and their community. We need to use those data to improve AI algorithms that can in turn improve student outcomes.

Automated scoring is among the most widely researched and deployed uses of AI in education. But replicating human scoring is the floor, not the ceiling. Researchers could use NAEP data to explore complex constructs that have more far-reaching implications than scoring—such as categorizing math misconceptions, identifying ways to improve student writing, or understanding the key themes present in student writings about civic engagement.
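
For readers unfamiliar with automated scoring, a minimal sketch may help. The toy example below, with an invented item and invented rubric scores, trains a simple classifier to reproduce human-assigned scores; production scoring systems are far more sophisticated, but the basic supervised-learning setup, in which human scores are the training target, is the same.

```python
# A toy sketch of automated scoring: supervised learning that reproduces
# human-assigned scores. The item, responses, and rubric are all invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical item: "A rectangle is 6 units by 4 units. What is its area?"
responses = [
    "The area is length times width, so 6 times 4 is 24 square units.",
    "I added 6 and 4 and got 10.",
    "24 square units, because area equals length times width.",
    "The answer is 10 units.",
    "Multiply the sides: 6 x 4 = 24.",
    "6 plus 4 equals 10, so 10.",
]
human_scores = [1, 0, 1, 0, 1, 0]  # hypothetical human rubric scores

# Word and bigram features plus logistic regression: the floor, not the ceiling.
scorer = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
scorer.fit(responses, human_scores)

# Score an unseen response the way a human rater would be expected to.
print(scorer.predict(["Area is 6 times 4, which is 24 square units."]))
```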

With NAEP’s large samples and detailed contextual variables about the test-takers, their schools, and their families, we can also learn about the impact of many factors on student achievement.


Protecting student privacy is, of course, essential, but it is not a reason to delay the release of the data, as some argue. Many safeguards are already in place. Because NAEP reports results at the group level, every published result is a summary across many individuals, which makes protecting privacy easier than it is for individual assessments. Further, NAEP’s long history and its procedures minimize risk. For example, information that could identify a particular test-taker is removed before the data ever leave the school. There are known solutions to ensure that individual students cannot be identified even when only a small number of them fall into a given subgroup. Open-ended responses are trickier: NAEP doesn’t control what students put into these fields, and sometimes they wander off topic, revealing personal data that need to be scrubbed (writing, perhaps, that “My uncle, Frank Johnson, who lives in Auburn, was once busted for DUI”).

The Institute of Education Sciences, where we work, is scrupulously addressing privacy concerns in NAEP data. Our recently announced competition (with $100,000 in prizes) asks researchers to solve the difficult problem of using AI to replicate human-assigned scores for open-ended math items. Before NAEP math-assessment data were released to participants, the information was scrubbed of personally identifiable information and sensitive language using both automated and human reviews. The reviews ensured that neither student identities nor other types of sensitive information, such as a social media handle, were disclosed. The dataset is being further processed through our internal controls to ensure it is sufficiently safe to release.
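
As an illustration of the automated portion of such a review, the sketch below scrubs names and places from a response using off-the-shelf named-entity recognition. This is a hedged example only, not NAEP’s actual pipeline, which pairs automated screening with human review.

```python
# A minimal sketch of automated PII scrubbing, not NAEP's actual pipeline.
# Assumes the small English spaCy model is installed:
#   python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")

def scrub_pii(text: str) -> str:
    """Replace detected names, places, and organizations with placeholders."""
    doc = nlp(text)
    scrubbed = text
    # Replace right-to-left so earlier character offsets stay valid.
    for ent in reversed(doc.ents):
        if ent.label_ in {"PERSON", "GPE", "LOC", "ORG"}:
            scrubbed = scrubbed[:ent.start_char] + "[REDACTED]" + scrubbed[ent.end_char:]
    return scrubbed

print(scrub_pii("My uncle, Frank Johnson, who lives in Auburn, was once busted for DUI."))
# Expected output along the lines of:
# "My uncle, [REDACTED], who lives in [REDACTED], was once busted for DUI."
```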

Decisions regarding data privacy should be weighed for the relative risk and reward. The value of tapping NAEP’s data gold mine is high, and, given its history and design, the risk to student privacy is low. In short, privacy concerns should not inhibit the release of NAEP data to qualified researchers.


Research using NAEP data could improve NAEP itself but, more importantly, answer questions about how students learn. For NAEP as an assessment, modern research methods could be used to help review and revise the questions, identifying items that specific groups of students find difficult due to wording or issues not related to the underlying construct. This would move beyond standard psychometric analyses through the incorporation of rich contextual data.
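
One established way to do this kind of item review, offered here as an illustration rather than as NAEP’s own procedure, is differential item functioning (DIF) analysis: regress item correctness on an ability measure plus group membership, and flag items where group membership still predicts success after ability is controlled. A minimal sketch with simulated data:

```python
# A hedged sketch of logistic-regression DIF screening on simulated data;
# a real analysis would use NAEP response files and proper ability estimates.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 2000
ability = rng.normal(size=n)        # stand-in for a total-score ability measure
group = rng.integers(0, 2, size=n)  # 0/1 demographic group membership

# Simulate an item that penalizes group 1 even after controlling for ability.
logit = 0.8 * ability - 0.6 * group
p_correct = 1.0 / (1.0 + np.exp(-logit))
correct = (rng.random(n) < p_correct).astype(int)

# A negative, significant group coefficient flags the item for review.
X = sm.add_constant(np.column_stack([ability, group]))
result = sm.Logit(correct, X).fit(disp=0)
print(result.summary(xname=["intercept", "ability", "group"]))
```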

NAEP data could have much broader applicability, especially in the context of large language models, the approach underlying generative AI. Most existing large language models are trained on data scraped from all over the web. While OpenAI, the company that created ChatGPT, does not disclose the specific data sources used for model training, ChatGPT is reportedly trained on web texts, books, news articles, social media posts, code snippets, and more. There are more than a few examples of ChatGPT providing questionable or toxic responses depending on the prompt it is given. An equally serious (and related) problem is that large language models do not have access to enough student academic work, leaving them severely anemic just where we need them most. NAEP data could help with fine-tuning these models, making them more accurate and more useful.
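
Fine-tuning has a concrete technical meaning: continuing a pretrained model’s training on a new corpus. The sketch below shows what that might look like with the open-source Hugging Face transformers library; “student_essays.txt”, the GPT-2 base model, and all hyperparameters are placeholders for illustration, not a description of any actual NAEP project, and any real use of NAEP data would follow the privacy controls described above.

```python
# A minimal, illustrative fine-tuning run on a (hypothetical) corpus of
# student writing. GPT-2 stands in for a production language model.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)

# "student_essays.txt" is a placeholder file of scrubbed student work.
dataset = load_dataset("text", data_files={"train": "student_essays.txt"})
tokenized = dataset["train"].map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True,
    remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="naep-finetuned", num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("naep-finetuned")
```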

We are only beginning to see how the future of education research will be transformed by generative AI—but one thing is crystal clear: NAEP data must be part of that future. Opening up NAEP’s gold mine of data is an easy call. Doing so will allow us to tap into the creativity of the research community to explore what insights we can derive from NAEP data that will be useful to education stakeholders.

NAEP is approaching a $200-million-a-year operation. While it produces invaluable insights into student achievement, it has not yet delivered on its full promise.

A version of this article appeared in the May 31, 2023 edition of Education Week as The Nation’s Report Card Could Be Education’s Data Gold Mine
