Artificial Intelligence What the Research Says

How AI Simulations Match Up to Real Students—and Why It Matters

By Sarah D. Sparks — September 10, 2025 4 min read

AI-simulated students consistently outperform real students—and make different kinds of mistakes—in math and reading comprehension, according to a new study.

That could cause problems for teachers, who increasingly use general prompt-based artificial intelligence platforms to save time on daily instructional tasks. Sixty percent of K-12 teachers report using AI in the classroom, according to a June Gallup study, with more than 1 in 4 regularly using the tools to generate quizzes and more than 1 in 5 using AI for tutoring programs. The findings suggest that, even when prompted to cater to students of a particular grade or ability level, the underlying large language models may create inaccurate portrayals of how real students think and learn.

“We were interested in finding out whether we can actually trust the models when we try to simulate any specific types of students. What we are showing is that the answer is in many cases, no,” said Ekaterina Kochmar, co-author of the study and an assistant professor of natural-language processing at the Mohamed bin Zayed University of Artificial Intelligence in the United Arab Emirates, the first university dedicated entirely to AI research.

How the study tested AI “students”

Kochmar and her colleagues prompted 11 large language models (LLMs), including those underlying generative AI platforms like ChatGPT, Qwen, and SocraticLM, to answer 249 math and 240 reading grade-level questions from the National Assessment of Educational Progress while adopting the persona of typical students in grades 4, 8, and 12. The researchers then compared the models’ answers with NAEP’s database of real student answers to the same questions to measure how closely the AI-simulated students’ answers mirrored actual student performance.

The LLMs that underlie AI tools do not think but generate the most likely next word in a given context based on massive pools of training data, which might include real test items, state standards, and transcripts of lessons. By and large, Kochmar said, the models are trained to favor correct answers.

“In any context, for any task, [LLMs] are actually much more strongly primed to answer it correctly,” Kochmar said. “That’s why it’s very difficult to force them to answer anything incorrectly. And we’re asking them to not only answer incorrectly but fall in a particular pattern—and then it becomes even harder.”

For example, while a student might miss a math problem because he misunderstood the order of operations, an LLM would have to be specifically prompted to misuse the order of operations.

None of the tested LLMs created simulated students that aligned with real students’ math and reading performance in 4th, 8th, or 12th grades. Without specific grade-level prompts, the proxy students scored significantly higher than real students in both math and reading. In reading, for example, they scored 33 to 40 percentile points higher than the average real student.

Kochmar also found that simulated students “fail in different ways than humans.” Specifying a grade in the prompt did bring simulated students closer to real students in how many answers they got correct, but their errors did not necessarily follow the patterns of common human misconceptions, such as misapplying the order of operations in math.
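
The difference between matching scores and matching mistakes can be shown with a small sketch. This is not the researchers' method: the answer data, the function names, and the overlap measure below are all hypothetical, chosen only to illustrate how a simulated class could get the same share of questions right as real students while picking different wrong answers.

```python
from collections import Counter


def accuracy(answers, key):
    """Share of answers that match the correct option."""
    return sum(a == key for a in answers) / len(answers)


def wrong_answer_overlap(real, simulated, key):
    """Overlap (0 to 1) between two groups' distributions of wrong answers.

    1.0 means both groups pick the same wrong options in the same
    proportions; 0.0 means they share no wrong options at all.
    """
    real_wrong = Counter(a for a in real if a != key)
    sim_wrong = Counter(a for a in simulated if a != key)
    if not real_wrong or not sim_wrong:
        return 0.0
    real_total = sum(real_wrong.values())
    sim_total = sum(sim_wrong.values())
    options = set(real_wrong) | set(sim_wrong)
    return sum(
        min(real_wrong[o] / real_total, sim_wrong[o] / sim_total)
        for o in options
    )


# Hypothetical data for one multiple-choice item whose correct option is "B".
KEY = "B"
real_answers = ["B"] * 6 + ["C"] * 3 + ["A"] * 1       # invented, not NAEP data
simulated_answers = ["B"] * 6 + ["D"] * 4              # invented, not model output

print("real accuracy:", accuracy(real_answers, KEY))
print("simulated accuracy:", accuracy(simulated_answers, KEY))
print("wrong-answer overlap:",
      wrong_answer_overlap(real_answers, simulated_answers, KEY))
```

In this invented example both groups answer 60 percent of the time correctly, yet the overlap in their wrong answers is zero, which is the kind of gap a score-only comparison would miss.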

The researchers found no prompt that fully aligned simulated and real student answers across different grades and models.

What this means for teachers

For educators, the findings highlight both the potential and the pitfalls of relying on AI-simulated students, underscoring the need for careful use and professional judgment.

“When you think about what a model knows, these models have probably read every book about pedagogy, but that doesn’t mean that they know how to make choices about how to teach,” said Robbie Torney, the senior director of AI programs at Common Sense Media, which studies children and technology.

Torney was not connected to the current study, but last month he released a study of AI-based teaching assistants that similarly found alignment problems. AI models produce answers based on their training data, not professional expertise, he said. “That might not be bad per se, but it might also not be a good fit for your learners, for your curriculum, and it might not be a good fit for the type of conceptual knowledge that you’re trying to develop.”

This doesn’t mean teachers shouldn’t use general prompt-based AI to develop tools or tests for their classes, the researchers said. Rather, educators need to prompt AI carefully and use their own professional judgment when deciding whether AI outputs match their students’ needs.

“The great advantage of the current technologies is that it is relatively easy to use, so anyone can access [them],” Kochmar said. “It’s just at this point, I would not trust the models out of the box to mimic students’ actual ability to solve tasks at a specific level.”

Torney said educators need more training to understand not just the basics of how to use AI tools but their underlying infrastructure. “To be able to optimize use of these tools, it’s really important for educators to recognize what they don’t have, so that they can provide some of those things to the models and use their professional judgement.”
