AI chatbots are like students who don’t do the reading and raise their hand anyway.
A new paper from researchers at OpenAI, the company behind ChatGPT, finds that a major reason AI hallucinates is that large language models are engineered to eliminate uncertainty. An AI chatbot that says “I don’t know” gets the same training score as one that offers incorrect information. As a result, AI provides confident-sounding answers that are frequently wrong. Many people are swayed by AI’s display of certainty, conflating the presentation of information with its quality.
Education should teach students to grapple with complexity. AI is designed to avoid it. This mismatch is yet another reason to slow the rush to put AI in students’ hands. It also points to a question teachers should ask themselves when evaluating how to integrate AI into the classroom: Can I use this tool in a way that models the thinking I want to teach my students?
For more than two decades, the Digital Inquiry Group and its earlier iteration at Stanford University have created curriculum that teaches students to read and think like historians, centered on document-based inquiry. After a 2016 study showed that young people struggle to evaluate online sources, our research group developed curriculum to help students separate fact from fiction on the internet.
What historical thinking and online reasoning share is the imperative to look beyond the surface of information and instead seek a broad context before diving in. Information doesn’t come out of nowhere: It’s authored by someone, somewhere, for some purpose. These considerations are essential in deciding what to trust.
Historians approach a document by sourcing it, glancing briefly at its contents before darting to the bottom to ponder its date, author, and relationship to the events it describes. These crucial details frame historians’ subsequent reading.
Similarly, when we studied how professional fact-checkers at the nation’s leading news outlets approach an unfamiliar website, we noticed that they almost immediately opened new tabs and read laterally to gain context. To investigate an unfamiliar digital text, a savvy reader, paradoxically, first needs to leave it.
The approach of both historians and fact-checkers differs from how students interact with historical texts, social media posts, and more recently, AI chatbot responses.
Many students see historical documents as vessels of information, not altogether different from their textbook, and are oblivious to authorship and inattentive to historical context. Likewise, students and adults alike are often swayed by the appearance of a video on social media or the authoritative tone of a website’s About page. Our most recent data at the Digital Inquiry Group suggest the same pattern may be emerging when it comes to AI—and that the chatbot’s confident tone is a likely culprit.
In a pilot study where students used AI to search the internet, we asked them to evaluate an answer from ChatGPT that failed to cite sources. “It touches on everything you could think of to ask,” said one student. “It gave a detailed response,” wrote another. We don’t want students to approach any source as the sole arbiter of truth, be it a traditional textbook or a shiny new chatbot. We never want them to rely on fabricated citations when they can’t find a source to support their argument—something that even professionals have been caught doing.
What we want is for students to weigh evidence. To recognize the challenging, fascinating, and rewarding process of piecing together a coherent account from multiple sources. To learn that admitting what we don’t know is its own achievement, its own unique form of knowledge.
And this is what concerns us about the findings of OpenAI’s researchers. Chatbots, the researchers admit, are programmed to provide authoritative responses to complex, thorny, and often unanswerable questions. The companies designing chatbots disincentivize the very expression of uncertainty that is so crucial in the classroom.
We shouldn’t hold our breath waiting for AI companies to fix their models. The good news is that AI is malleable enough, and individual users capable enough, that educators can take immediate action to apply lessons we already have about good thinking, good research, and good education.
Take, for example, an AI response that doesn’t cite sources. Even a few hours of instruction can get students to pay more attention to where information comes from. Our research group is now experimenting with teaching students how to prompt a chatbot to cite its sources. Our goal is to nudge students to be skeptical of AI answers pulled from Reddit threads and random blog posts and instead direct the model to sources that reflect subject-matter expertise. Students need to see their interaction with a chatbot as a process, much like knowledge creation, rather than a one-and-done exchange.
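To make the idea concrete, here is a minimal sketch of the kind of prompt we have in mind, written against the OpenAI Python SDK. The model name, system instruction, and question are illustrative placeholders rather than a tested classroom exercise, and the same instruction can simply be typed into a chat window without any code at all.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Ask the model to ground every claim in a citable source, and to admit
# uncertainty rather than invent a reference. Wording is illustrative only.
response = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    messages=[
        {
            "role": "system",
            "content": (
                "Answer only with claims you can attribute to a named, citable "
                "source written by a subject-matter expert. List each source "
                "with its author, title, and date. If you cannot identify such "
                "a source for a claim, say 'I don't know' instead of guessing."
            ),
        },
        {
            "role": "user",
            "content": "Why did the Dust Bowl of the 1930s happen?",  # sample question
        },
    ],
)

print(response.choices[0].message.content)
```

The point of the exercise is less the specific wording than the habit it builds: students specify where the answer should come from before they decide how much to trust it.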
Despite its flaws, AI can serve as a powerful contextualization portal. But only if the people using it recognize how fallible it can be and how much we still don’t know about how it works, and learn to prompt it so that it produces quality responses.
Information expert Mike Caulfield, for example, has illustrated how asking a chatbot to weigh the evidence for and against a claim can produce significantly better responses than just asking for a simple answer. With a more specific prompt, chatbots will often include qualifications about expert disagreement or lack of scholarly consensus.
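A hypothetical version of that kind of request, again sketched with the OpenAI Python SDK, might look like the following. The claim and the prompt wording are ours, offered only to show the structure of a “weigh the evidence” prompt rather than Caulfield’s exact technique.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

claim = "Daily homework improves long-term retention for middle school students."

# Instead of asking "Is this true?", ask for the evidence on both sides
# and for any points of expert disagreement. Wording is illustrative only.
prompt = (
    f'Consider the claim: "{claim}"\n'
    "Summarize the strongest evidence for it and the strongest evidence "
    "against it, and note where experts disagree or where scholarly "
    "consensus is lacking. Do not give a one-word verdict."
)

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    messages=[{"role": "user", "content": prompt}],
)

print(response.choices[0].message.content)
```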
Good educators don’t punish their students for uncertainty. And that means good educators should be cautious about placing a technology in students’ hands that’s trained to avoid saying “I don’t know.”
Our research group has long advocated digital literacy instruction and criticized approaches that tell students to stay away from search engines and shelter in the safety of peer-reviewed databases. But there is a difference between teaching students how to drive safely and throwing them into a Formula 1 car before they have a license.
Too much AI instruction right now looks like the latter. When it comes to AI in schools, all of us need a dose of humility that AI, at least for now, clearly lacks.