Across the country, school districts are moving quickly to adopt AI, even as they try to establish guardrails around it. Chicago public schools’ guidebook, for example, touts AI’s potential to elevate educational delivery. New York City public schools’ recently released guidelines aim to establish “a foundational vision for how to use AI going forward.” And Nevada’s state-level guide explicitly “embraces AI in its schools.” Thirty-four states now have official AI guidance or policies for K-12 schools, and the message seems clear: The question is no longer whether to integrate AI but how.
Embedded in that message, however, is the assumption that AI improves student learning and prepares young people for the workforce. The research tells a more complicated—and cautionary—story.
A field experiment involving nearly 1,000 high school students last year suggested that ChatGPT use can harm learning. A University of Pennsylvania research team found that students who used the AI demonstrated a 48% improvement in grades while they had access to it, but when AI access was removed, they scored 17% worse than students who had never used AI at all.
A recent report from Stanford University’s AI Hub for Education confirmed the pattern: AI tools often boost academic performance while students have access, but those gains weaken or vanish when students are assessed on their own without using AI tools. The takeaway is sobering: Short-term performance gains can mask long-term learning deficits. AI is a crutch.
Durable learning requires cognitive engagement, productive struggle, and repetition. When students use AI to summarize a reading, draft an essay, or work through a problem, they may produce polished outputs, but they are not building lasting knowledge. The shortcut bypasses the very processes that develop learning.
This does not mean AI has no place in K-12 classrooms. It means its place in schools must be defined carefully, with student learning—not efficiency, improved educational experiences, or career readiness—as the primary criterion.
The research points toward a clear default that AI policies should require: Design lessons around the conditions known to produce durable learning first, and integrate AI only when it genuinely supports, rather than substitutes for, those conditions. Those conditions, articulated by neuroscientist and educator Jared Horvath, include:
- A strong, continually reinforced base of domain knowledge;
- Opportunities for deep processing and productive struggle;
- Independent and critical thinking; and
- Meaningful human interaction.
Before adding AI to any lesson, activity, or assignment, teachers and instructional leaders should work through each of the four questions below. Together, the questions translate these conditions for learning into a practical decision-making tool for appropriate AI use.
Question 1: Will students have to use, recall, and demonstrate core content knowledge?
Higher-order thinking is built on a foundation of domain knowledge. Think about a middle school persuasive-essay assignment. Students cannot analyze a persuasive essay before understanding what a thesis or a counterargument is. If an AI tool actively engages students with foundational content—through retrieval practice, targeted feedback, or elaborative questioning—it may be worth integrating. If it lets students bypass that content, it is almost certainly counterproductive.
Question 2: Will students have to apply their learning to a new context?
Transfer—applying knowledge to a new situation—is one of the most reliable signs of genuine understanding. When a student applies the structures of a persuasive essay to a new essay topic, they are moving information from working memory into lasting knowledge. If an AI tool scaffolds that transfer while preserving cognitive effort, it may add value. If it does the transfer for the student, learning is short-circuited.
Question 3: Will students have to think independently and defend their own reasoning?
Critical thinking requires students to make judgments and defend them. Is it more valuable to ask students to analyze an AI-generated essay or to write one themselves? Writing demands that students wrestle with real questions: How do I open this? How do I build a clear argument? What can I cut? That productive struggle—brainstorming, drafting, and revising—is the learning. AI-generated scaffolding, however well-intentioned, can short-circuit it entirely. It’s like taking a shortcut at the gym—you skip the part that actually builds strength.
AI integration that substitutes the tool’s judgment for the student’s own undermines this process. A simple test: After completing an assignment using AI, can the student explain in their own words why they made the choices they made?
Question 4: Will meaningful human interaction be preserved?
Peer feedback, collaborative problem-solving, and teacher-student dialogue do more than support academic learning—they develop the social and intellectual habits that define educated citizens. Human interaction sparks curiosity, broadens perspective, and builds the kind of trust and accountability that AI cannot replicate. Before adding any AI tool, ask whether it complements or competes with these interactions. An AI-enhanced discussion board that replaces peer response with algorithmic feedback may sacrifice more than it gains.
AI is not going away, and blanket resistance serves no one. There are genuine uses where AI supports learning without undermining it. But the pace of adoption in K-12 is far outrunning the pace of evidence. Teachers are often caught in the middle—pressured to integrate tools their districts endorse and their students already use, without any official guidance on whether doing so will help or harm the students they are trying to educate.
It is also worth remembering that AI developers have profit motives that have nothing to do with improving student outcomes. The enthusiasm of technology companies is not evidence of pedagogical effectiveness.
The four questions above will not resolve every instructional dilemma, but they provide a principled starting point. If an AI tool cannot clearly support content knowledge, productive struggle, independent reasoning, and human interaction—the conditions we know produce learning—the default should be to leave it out. The burden of proof should be on the technology and the policymakers pushing it, not on the teacher who questions it.