Teaching Students to Wrangle 'Big Data'
Data science classes are popping up in schools
Jasmin Perez is in math class, and her assignment is to figure out how likely she is to die.
It's an unusual exercise in the world of high school math, where most students are calculating angles or solving for x. But Jasmin and her 17 classmates are in a new and rare kind of math course: data science.
It's a blend of statistics and computer science, designed to help students understand and use the "big data" that shapes modern life. The labor market is hungry for skilled data wranglers, and pays them well, but few schools offer data-science classes.
"Today we will be simulating you guys living or dying," said Brent Rojo, a math teacher here at Venice High School, which serves predominantly nonwhite, low-income students on this city's western edge.
Rojo leads the students through an exercise in applied probability, sparked by a study that found male characters in horror movies are more likely to die than female characters. Laughing nervously, the students draw slips of paper from a stack in their teacher's hand to find out whether they "live" or "die," and line up accordingly on opposite sides of the room. They do this several times, with Rojo reshuffling the stack of papers.
They calculate the difference between male and female "survival" rates and discuss whether patterns in the data were shaped by chance. They'll create dot plots of their results, and use a powerful computer-programming language called R to analyze, display, and interpret their data.
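The slip-drawing exercise can be approximated in a few lines of code. The students do their analysis in R; the rough sketch below uses Python instead, and it is an illustration rather than the course's own material. The class size of 18 comes from the article, but the even split between girls and boys, the number of "live" slips, and every function name are assumptions made for the example.

```python
import random


def draw_outcomes(n_students, n_survivors, rng):
    """Shuffle a stack of 'live'/'die' slips and deal one to each student."""
    slips = ["live"] * n_survivors + ["die"] * (n_students - n_survivors)
    rng.shuffle(slips)
    return slips


def survival_rate_gap(genders, outcomes):
    """Return the female survival rate minus the male survival rate."""
    def rate(group_label):
        group = [o for g, o in zip(genders, outcomes) if g == group_label]
        return sum(o == "live" for o in group) / len(group)

    return rate("F") - rate("M")


def simulate_gaps(genders, n_survivors, n_rounds, seed=0):
    """Repeat the random draw many times and record the gap each round."""
    rng = random.Random(seed)
    return [
        survival_rate_gap(genders, draw_outcomes(len(genders), n_survivors, rng))
        for _ in range(n_rounds)
    ]


# Assumed class makeup: 18 students, 9 girls and 9 boys, 9 "live" slips.
genders = ["F"] * 9 + ["M"] * 9
gaps = simulate_gaps(genders, n_survivors=9, n_rounds=1000)

mean_gap = sum(gaps) / len(gaps)
# Share of rounds in which chance alone produced a gap of 30 points or more.
share_large = sum(abs(g) >= 0.3 for g in gaps) / len(gaps)

print(f"average gap across rounds: {mean_gap:+.3f}")
print(f"share of rounds with a gap of 30+ points: {share_large:.1%}")
```

Because the slips are handed out at random, the average gap should sit near zero, yet individual rounds can still show sizable differences between the two groups. That spread is what lets students judge whether a gap they observe could plausibly be due to chance.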
"I like this class because it's more relatable to me," said Jasmin, 17. In other math classes, she said, she's always wondered why she had to learn that stuff, but not in this one.
During the year, these students will do statistics projects about themselves, using their phones to keep track of how they spend their time, or the snack foods they eat, and using R for analysis. They'll explore topics as diverse as Republicans' and Democrats' views of impeachment, the factors behind the rise of wildfires, and how passengers' chances of survival on the Titanic were linked to the price of their ticket. (Those with pricier tickets were more likely to survive.)
The teenagers in Rojo's Introduction to Data Science class are among the few nationally with access to such a course. The Los Angeles Unified school district and the University of California-Los Angeles wrote the free curriculum in 2010 with a grant from the National Science Foundation, and high schools began teaching it in 2014.
This year, more than 3,200 students are taking it in 51 schools across Southern California, and the curriculum's creators are talking with several other states that are interested in using it.
Data Skills in Demand
From a labor-market point of view, there is abundant reason to build students' data acumen. "Data scientist" has topped Glassdoor's list of best jobs in America for the last four years, with a median base salary of $108,000. It's repeatedly made LinkedIn's list of "most promising" jobs because of good pay and high demand.
And fields as varied as health care, manufacturing, and K-12 school administration pay higher wages to employees with strong data-analysis skills, according to the analytics and software company Burning Glass Technologies.
The need for data-science skills is showing up in College Board products, too. The company is weaving that content into Advanced Placement courses such as government and biology, the company's president, David Coleman, said in a "Freakonomics" podcast.
And even on the reading/writing portion of the new SAT, students face questions that require them to use data from charts and graphs.
"Data rules our world. We need to be speaking to this idea a great deal more than we do," said Paul Myers, who has taught math for 50 years and who has created a data-science course for his high school students at Paideia School, a private pre-K-12 school in Atlanta.
"Bless their hearts, most schools have had this fixed curriculum in place forever: algebra, geometry, trig, maybe calculus. We are living in a world of high school math curriculum that doesn't acknowledge to any real degree that big data is out there."
Rethink Math Sequences?
Discussion about data-science courses clicks into a larger national conversation about reworking math offerings. All but 16 percent of high school students bail out of math before taking any calculus, and few of those who take it go on to advanced math in college. One study showed that barely 1 in 3 college graduates report needing content from Algebra 2 or beyond in their daily lives. All of that suggests a need for new kinds of math courses, some advocates say.
"Too often, high school math is about filtering out kids, not engaging them," said Phil Daro, a lead author of the Common Core State Standards in math, and a co-author of a recent paper that argued for creation of better math options for students who are more likely to study history or music than aeronautics.
"Math is important for a broad range of people, not just math majors, but the math non-STEM majors need is different. It's more applied," Daro said.
Aside from California's IDS course, however, there are few data-science curricula available. One well-regarded source of free lessons is Bootstrap, a research project based at Brown University. The International Data Science in Schools Project, a cooperative of math and computer science experts, has developed frameworks for data-science instruction, and the Concord Consortium, a nonprofit focusing on technology in STEM instruction, offers activities teachers can use, along with a free data-analysis software tool called CODAP.
Advocates of teaching data science point out that it doesn't need to wait until high school. Jo Boaler, a Stanford University math education professor who favors a bigger role for data instruction, includes an activity for 3rd graders on her website, youcubed.org.
A group of KIPP charter schools in New York City has been using Bootstrap's data-science methods in 5th grade history lessons. The children develop theories about what led to the downfall of Mayan civilization, using data to investigate various factors that could have played a role, including their diets, their human-sacrifice practices, and deforestation, said Chéla S. Wallace, the science director for KIPP-New York.
Suyen Machado, the Los Angeles math specialist who helped write the IDS curriculum, said it offers students a chance to see how they can harness math to make change in the world, even if they don't think of themselves as "math people."
"Students can create projects that facilitate civic engagement," she said. "They can use data to question what they're hearing, to see that data empowers."
A student in Rojo's class last year decided to test whether media reports about crime create a false impression of increasing danger. She analyzed crime rates in her city and neighborhood and found that violent crime had actually been declining, Rojo said.
That kind of experience can help "non-mathy students" see math—and themselves—differently, Rojo said.
"The course attracts kids who haven't had good experiences with math that has no context," he said. Math usually asks kids to "solve for x, graph things that have no meaning, or solve for variables no one cares about," he said. "In my class, half the battle is giving students the skill set to look at data, and half is restoring their faith in math."
Concerns and Barriers
Popularizing data-science instruction faces substantial challenges, however. Teachers will likely need training in computer science and data-collection methods; for the IDS curriculum, they learn that in summer institutes.
It's also important to ground data-science instruction in a topic, rather than learning those skills abstractly, in isolation, said Emmanuel Schanzer, Bootstrap's founder and co-director.
"You can't do responsible data analysis without knowing something about the subject you're analyzing," he said. That need for content expertise presents a "natural opportunity" to weave data projects into all academic subjects. But that, too, requires teacher training.
Thirty-seven percent of educators say their districts offer lessons on data analysis (or analytics) only in secondary schools.
Experts also caution that any study of big data should include ethics.
"Students need to ask questions such as how should big data be used? And should some data even be collected to begin with?" Schanzer said.
One of the biggest barriers to the widespread embrace of data science as a math course lies in the college admissions office. The authors of the IDS curriculum persuaded California's two state university systems to accept the course in place of Algebra 2 for their three-years-of-math entrance requirement.
But that doesn't tackle the reverence many colleges hold for calculus on a student's transcript. As a result, students aiming for selective colleges often take calculus to burnish their applications.
Some math experts worry that conceptualizing data-science instruction as an alternative to a calculus pathway risks perpetuating a fundamentally flawed way of imagining math instruction.
"Once you start talking about alternate pathways to calculus, it's just code for a rigorous pathway and a weak one," said Steven Strogatz, a professor of applied mathematics at Cornell University whose 2019 book, Infinite Powers, explores the wonders of calculus. "You've got to worry about tracking, and racism, sexism, and class divisions" in the way students are counseled into math courses, he said.
Instead of setting up math study as parallel pathways ending either in statistics or calculus, Strogatz imagines a radically different approach that blends all math subjects in a "web" of student-led, inquiry-based instruction starting in kindergarten. That way, once students get halfway through high school, they've been introduced to all of math's essential concepts, and can make informed choices about whether to take higher-level math.
Vol. 39, Issue 20, Pages 20-22
Published in Print: February 5, 2020, as Teaching Students to Wrangle 'Big Data'