Making statistics not just palatable, but delicious

New class puts statistics in the middle of real life (where it’s been all along)

Money, love, health, innocence or guilt — even finding the right wine. Who doesn’t want to know more?

“Real-Life Statistics: Your Chance for Happiness (or Misery),” offered this semester by Harvard’s Department of Statistics, will explore the critical tools to make good judgments in matters large and small.

To do that, the new course will “demonstrate the use of statistics without [students] actually learning lots of formulas,” said department chair Xiao-Li Meng, chief architect and teacher of what is formally known as Statistics 105.

The first-time offering is meant to appeal to the statistics novice (there is only one prerequisite). It might also inspire others to join a field Meng calls “underappreciated” (and whose practitioners, he says, are much sought after on Wall Street and elsewhere).

“This could be their last statistics course, or almost their first statistics course,” he said of prospective students. He acknowledged in particular a majority of students who “just get scared” when confronted with the science of chi-square tests, correlation coefficients, and regression analysis.

In practical terms, every student reading technical papers or even just the newspaper, said Meng, “should know what arguments are scientifically and statistically sound.”

But why not learn by having fun? he asked. Statistics 105 was put together over two years with the help of what Meng calls his “Happy Team” of graduate students. It uses five modules of inquiry drawn from the worlds of finance, romance, medicine, legal judgments, and food choice, in this case finding the best chocolate.

“Statistics is about making decisions in the real world,” said Happy Team member Yves Chretien, an M.D./Ph.D. student who is midway through Harvard Medical School and is taking time out to finish his doctorate in statistics.

On Jan. 30, about 100 students filed into the cavernous Science Center B for the opening class of Statistics 105. In a joke-filled and lively 90 minutes, Meng gave them a taste of all five topics.

“Everybody wants money,” he said of the finance section, which opened with a chart of Wall Street profit trends and a look at how to mine it for useful data. “And once you have money, you want romance.”

Meng led the students through lessons on polling for data in the world of online romance, explaining the science of query populations and of question design. “The dating world,” he said, “is full of questions we would all love answers to.”

The students helped get a few answers by responding to in-class survey questions using handheld clickers linked to Meng’s laptop — a first use of the interactive technology for a Harvard statistics department course. “Real-time feedback,” said Chretien, “real-time data.”

In online romance terms, what would you like on a first date, Meng asked: a person who plays “hard to get,” or one who is “clearly into” you? The clicker data was strong: 4-to-1 in favor of a date clearly into them.

But can you generalize from a data set drawn from a room full of Harvard students in roughly the same age group? No, said Meng. A good survey requires knowing the people whose opinions you are collecting.

In the world of health and medicine, he outlined the complexity of judgment needed to draw a conclusion from clinical trials — “a huge industry,” he said, “and exceedingly complicated.”

As an example, Meng cited a study that compared two treatments used for kidney stones. It showed the persuasive allure of quantitative evidence, as well as the way a nonrigorous examination of numbers can backfire.

At first glance, Treatment B seemed to have the higher share of good outcomes: 83 percent to 78 percent. But behind the numbers, by way of something statisticians call Simpson’s Paradox (a pattern in which a trend in pooled data reverses once the data are split into subgroups), Treatment A clearly trumped B. Meng’s lessons: Look for additional data and know what questions were asked to arrive at each set of numbers.
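The reversal is easier to see with counts in hand. The article does not give the underlying numbers, so the sketch below assumes Meng was citing the widely reproduced 1986 kidney-stone study, whose pooled success rates match the 83 and 78 percent quoted here. The split by stone size and all variable names are part of that assumption, not something reported from the lecture.

```python
# Assumed counts (successes, patients) from the widely cited 1986
# kidney-stone study; the article itself reports only the pooled rates.
data = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}

def subgroup_rate(treatment, size):
    """Success rate for one treatment within one stone-size group."""
    successes, patients = data[treatment][size]
    return successes / patients

def overall_rate(treatment):
    """Success rate for one treatment with both groups pooled."""
    successes = sum(s for s, _ in data[treatment].values())
    patients = sum(n for _, n in data[treatment].values())
    return successes / patients

for treatment in ("A", "B"):
    print(f"Treatment {treatment} overall: {overall_rate(treatment):.0%}")
    for size in ("small", "large"):
        print(f"  {size} stones: {subgroup_rate(treatment, size):.0%}")
```

Under these assumed counts, B wins on the pooled figures, yet A does better within both the small-stone and the large-stone groups. The pooled comparison is skewed because A was used far more often on the harder, large-stone cases.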

In a parallel way, statistics have a special allure in courts of law, where lawyers may use quantitative data in ways that seduce the untrained mind. “Statistics is always about presenting evidence,” said Meng, and listeners have to be aware of the pitfalls.

As a case study, he used a real-life example of a man accused of rape based on DNA found at the crime scene. Chretien and the course’s other teaching fellow, Kari Lock, role-played lawyers for the prosecution and defense, basing their arguments on the same DNA evidence. Clickers in hand, just over half the students voted to convict. The informed answer? As with the other examples, Meng promised it would come later in the semester.

Toward the end of class, Statistics 105 took a lighter turn: lessons on probability and confidence intervals based on chocolates that may or may not contain champagne. (With lunchtime drawing near, everyone got some samples.)

Meng collected several sets of student-clicker data, based on questions that increasingly revealed more information. “How you ask a question,” he said of designing statistical queries, “is very important.”

The more information a question carries, the closer the polled opinions move toward the truth. In the first clicker survey, none of the students reported that their chocolate (chosen at random) contained champagne; by the third survey, 17 percent reported tasting champagne in their samples. (The real answer, 33 percent, should have emerged, since one-third of the chocolates were laced with champagne, said Meng.)
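For readers curious how a confidence interval applies to the third survey, here is a minimal sketch. It assumes exactly 100 students answered (the article says only "about 100" attended) and uses the simple normal-approximation interval for a proportion. The function name and the choice of method are illustrative, not taken from the course.

```python
import math

def wald_interval(successes, n, z=1.96):
    """Approximate 95% confidence interval for a proportion,
    using the normal (Wald) approximation."""
    p_hat = successes / n
    std_err = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * std_err, p_hat + z * std_err

n_students = 100  # assumption: the article says "about 100 students"
reported = 17     # 17 percent of the assumed 100 reported champagne

low, high = wald_interval(reported, n_students)
true_share = 1 / 3  # one-third of the chocolates were laced, per Meng

print(f"95% interval: {low:.3f} to {high:.3f}")
print("interval covers one-third:", low <= true_share <= high)
```

Under these assumptions the interval runs from roughly 10 to 24 percent and does not reach one-third. That suggests the gap between what students reported and what was in the box is more than sampling noise, which fits Meng's point that how a question is asked shapes the answers.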

Class ended on a light note, too, with a real-life lesson from a “Forrest Gump” video clip. “Life is like a box of chocolates,” the Tom Hanks title character offers. “You never know what you’re going to get.”

That may be true, said Meng later. But with a grasp of elementary statistics, he added, “you can estimate what you are going to get — especially after taking Stat 105.”