Teaching “assistants” powered by artificial intelligence can help save teachers’ time and improve student learning. But they also create hard-to-detect problems that could negatively affect students if teachers are not diligent.
That’s according to a new risk assessment of four popular AI teacher-assistant platforms by Common Sense Media, a group that researches and advocates for healthy tech and media use among youth. These assistants are designed and trained specifically to help teachers with common classroom tasks such as lesson planning, differentiating coursework, and grading, as well as administrative tasks such as writing newsletters and emails to parents.
Common Sense tested Google’s Gemini in Google Classroom, Khanmigo’s Teacher Assistant, Curipod, and MagicSchool.
Among the major issues identified in the report: AI teacher-assistant tools can produce biased outputs based on what they perceive to be a student’s race or background; they can fail to identify harmful misinformation; and they may not be appropriate for novice teachers to use without guidance and training.
While veteran teachers might be equipped to spot these issues, novice teachers may lack the experience to detect problems and determine when it’s appropriate to use an assistant, said Robbie Torney, the senior director of AI programs at Common Sense Media.
“As somebody who was a novice teacher once, speaking for myself, I was not aware of what I didn’t know,” he said. “Using an AI chatbot, you could see unintended consequences of a new teacher making decisions that could have long-term impacts on students.”
For example, he said, an inexperienced teacher might rely too heavily on text-leveling tools that default to giving struggling readers easier text.
“Text leveling can be helpful sometimes, but that’s not the only thing that struggling readers need,” Torney said. They “also need access to grade-level text. And if you’re giving a struggling reader access to easier text for an entire school year, that reader isn’t going to be making progress toward grade-level text.”
That doesn’t mean schools should bar early-career teachers from using AI teacher assistants, Torney said. But school leaders should not assume that because novice teachers tend to be younger and more tech-savvy in general, they don’t need support and training on how to use these platforms.
Nor can AI teacher assistants replace the need for mentors and learning communities for new teachers.
AI teacher assistants recommend different behavior interventions for Black students
Some issues that arose in Common Sense Media’s tests were obvious red flags—such as the AI platforms creating examples or activities for students that used harmful stereotypes, promoting misinformation as factual, or presenting sanitized versions of historical events.
But other issues flagged in the report are subtler and easier for teachers to miss, which makes them potentially more concerning, Torney said.
As part of the risk assessment, Common Sense Media tested both Google Gemini’s and MagicSchool’s tools for generating behavior intervention suggestions and strategies. Common Sense used the same prompts about, say, a disruptive student, but alternated using traditionally male and female names, and “white-coded” and “Black-coded” names.
Over many tests, Common Sense found a pattern of differences in which interventions the AI tools suggested for which students. These differences were based on what the AI was inferring from students’ names about their gender and race, the report said. For example, the report said: “Annie tended to get de-escalation-focused strategies; Lakeesha tended to get ‘immediate’ responses; and Kareem tended to have little specific guidance.”
On an individual student level, these recommendations might not sound any alarm bells, said Torney.
“When I looked at the model outputs for the Black-coded names, at the individual level, I was like, ‘oh, these model outputs are pretty good,’” he said. “It’s only when you compare the entire sample of the Black-coded names against the entire sample of the white-coded names, you start to notice some of these differences.”
But teachers aren’t doing those kinds of analyses in their day-to-day jobs, so it will be difficult for them to catch problems like that, Torney said.
In a response to an Education Week query, Google confirmed that it turned off a shortcut to the prompt “generate behavior intervention strategies” in Google Classroom out of what the company said was an abundance of caution as it evaluates the concerns raised in the Common Sense Media report.
“We use rigorous testing and monitoring to identify and stop potential bias in our AI models,” a Google spokesperson said in a statement. “We’ve made good progress, but we’re always aiming to make improvements with our training techniques and data.”
MagicSchool said in a statement that it had not been able to replicate the findings of the Common Sense Media report.
“As noted in the study, AI tools like ours hold tremendous promise—but also carry real risks if not designed, deployed, and used responsibly,” the statement said. “At MagicSchool, we are committed to ensuring our tools advance fairness, equity, and quality teaching for all students.”
Another issue identified in the report involves using AI teacher assistants to create individualized education programs, or IEPs, a feature included in three of the platforms Common Sense tested. These tools, at least as of now, can produce misleadingly polished documents, but they lack the sophistication to create truly comprehensive IEPs, Torney said.
AI should support teacher expertise, not replace it
The big takeaway from the report is that AI teacher assistants are only as good as the systems that surround them—such as the districts’ policies guiding usage, teacher training, oversight processes, and, perhaps most importantly, teachers’ expertise.
Teachers, the report said, should think carefully about what information they input into an AI teacher-assistant tool and whether it will lead to better outputs. For example, will sharing personal information like a student’s name improve the output, or potentially bias it?
Teachers should be particularly careful when relying on AI teacher assistants to generate images, which can promote harmful stereotypes and misinformation, the report said. For example, when Common Sense prompted MagicSchool to create a picture of a “thug,” it produced religious images of Jesus.
Teachers should also be conscious of the phenomenon of “automation bias,” the human tendency to put too much trust in the suggestions of automated systems, which can lead teachers to accept AI outputs without critically assessing them. Even though AI teacher-assistant tools can generate materials such as a slide deck in seconds, teachers should always review those materials with a critical eye before putting them in front of students, the report cautioned.
These platforms work best, the report concluded, when teachers are in the driver’s seat and the AI assistants are just that: assistants. For example, Torney doesn’t recommend handing a lesson over to an AI teacher assistant to differentiate on its own, but the tools can be valuable for helping teachers carry out differentiation they have already planned.
“You as the teacher already have a really clear plan for how you want to structure the different groups in your classroom and the types of experiences you want them to learn,” he said. “Giving very clear instructions to the teacher assistant about how you want to do that differentiation can save a bunch of time in generating those materials.”
One hallmark of a well-designed AI teacher assistant, according to the report, is that the platform is built primarily for teachers to use rather than students. If a platform does include features for students to use directly, those features should be customizable and remain under the teacher’s control. (The report credits MagicSchool for doing this.)
Well-designed tools also have options that allow teachers to upload lesson plans, curriculum guides, and other resources. This helps ensure that outputs from the AI tool are aligned with curricula and standards rather than undermining them.
Used properly, AI teacher assistants can improve learning by enhancing high-quality instructional materials and saving teachers time on busy work so they can focus more attention on the core aspects of teaching, Torney said.
“AI can really be a powerful assistant,” he said. “This can increase productivity and boost creativity. If I was still teaching, I would want to be using these tools. And if I was an administrator, I would really want my teachers to be using these tools.”