Why it’s become harder to project presidential winner on election night

Stephen Ansolabehere at the CGIS Knafel Building.

Photo by Dylan Goodman

Christy DeSmith

Harvard Staff Writer

November 1, 2024

Elections and public opinion expert details lessons learned since 2000, rise of absentee voting

Election night 2000 represents a difficult chapter in the history of broadcast news.

Exit polls showed a tight presidential election between Vice President Al Gore and Texas Governor George W. Bush. Just before 8 p.m., NBC News projected a Gore victory in the pivotal state of Florida, with all the other major television networks closely following. Two hours later, all retracted their forecasts as Gore’s margins narrowed with the reporting of additional votes.

Just after 2 a.m., the networks felt confident calling the state — and therefore the presidency — for Bush. But two hours later, they had to backtrack again when it became clear Florida was headed for a recount.

“The media had a vulnerability in understanding the changing nature of how elections are actually run,” said Stephen Ansolabehere, Frank G. Thompson Professor of Government, reflecting on Florida’s razor’s edge results, which exposed the networks’ reliance on outdated statistical modeling.

Five years later, Ansolabehere, an elections and public opinion expert, joined a team of social science Ph.D.s charged with improving data journalism and election forecasting for CBS News. “I’ve been there ever since for every midterm and presidential election as well as the primaries,” he said.

We caught up with Ansolabehere, mastermind of the long-running Cooperative Election Study, for a lesson on the evolving nature of real-time vote projections and a preview of election night 2024. The interview was edited for length and clarity.

How did TV journalists get it so wrong in 2000?

That election exposed a lot of administrative failures in the U.S. electoral system — from voting machines to registration to management of polling places. But the networks relied on the same old statistical models to ingest data — and then they just put that out. They stationed journalists in various areas to report on local results, but they weren’t necessarily placed where the problems were.

There was also a rush to report. The television networks were basically racing each other. Don’t get me wrong. There was certainly an ethos of getting it right because calling the election can have a big impact — it can shift how legal strategies are pursued by the campaigns afterwards. That was very much the case in 2000 and 2020. It was almost the case in 2004.

Journalists know it’s a big responsibility. But for me, it’s also this very cool problem.

Say more about that.

The very cool problem is, you’re shown in real time little portions of something that has occurred — the election. At 7 p.m., you get to see, say, 10 percent of what has happened. At what point can you make a decision? Do you need to see 50 percent of the data? Do you need 90 percent? It’s a missing data problem. It’s a forecasting problem. And it’s forecasting things that we political scientists care about, like turnout, vote shares, and how different groups voted.
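To make the "at what point can you make a decision?" question concrete, here is a minimal sketch of the most conservative possible rule: a lead is safe only when the trailing candidate could not catch up even by winning every ballot still outstanding. This is an illustration, not the method any network uses. The function names and the `expected_total` turnout estimate are assumptions introduced here, and real decision desks rely on statistical models rather than a worst-case bound.

```python
def outstanding_votes(counted: int, expected_total: int) -> int:
    """Ballots not yet reported, given an assumed estimate of total turnout."""
    return max(expected_total - counted, 0)


def lead_is_safe(leader: int, trailer: int, expected_total: int) -> bool:
    """True if the trailer cannot catch up even by winning every outstanding ballot."""
    remaining = outstanding_votes(leader + trailer, expected_total)
    return (leader - trailer) > remaining


# With 10 percent of an assumed 1,000,000 ballots reported, a 60-40 split decides nothing.
print(lead_is_safe(60_000, 40_000, 1_000_000))    # False
# With 90 percent reported, the same 60-40 split can no longer be overturned.
print(lead_is_safe(540_000, 360_000, 1_000_000))  # True
```

The gap between those two extremes is where forecasting comes in: a model that estimates how the unreported ballots are likely to split can support a call long before the worst-case bound is met.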

How has the rise of absentee voting complicated this work?

In 2000, one in eight ballots nationwide was cast absentee or early. In 2016, 40 percent of ballots were cast absentee or early. During COVID, more than 60 percent were. That makes it really challenging to understand what’s going on during election night. In particular, there have been a bunch of studies that say making absentee voting easier doesn’t increase turnout. But since the states made absentee voting easier, turnout has gone up a bunch.

Many will recall from 2020 that there are idiosyncrasies with how absentee ballots are reported. How has that affected your work on election night?

Our old data models were based on [electoral] precincts. There are about 180,000 precincts in the U.S. — each a tiny place, with about 1,000 people — and we could see in each one how things had shifted from one election to the next.

But absentee ballots are not reported at the precinct level. They’re usually reported at the county level. There are 3,000 counties in the U.S., and they are very heterogeneous. That makes it very difficult to understand which little pieces of the missing or obscured puzzle have been revealed as the votes are reported.

Absentee voting was also politicized in 2020. How did that affect your team?

Up until 2020, we were lucky because absentee voting was pretty much like in-precinct voting. That is, it was unrelated to how people voted. But when we started getting the data streams in 2020, Biden would be up by something like 20 points in the absentee ballots, because that’s usually what gets dumped first on election night. You’ll see zero precincts reporting, but 20 percent of the votes are in — that’s the absentees from the county.

The problem was particularly vexing out west, because there you have these big urban counties that account for something like 75 percent of the state’s population. Think Maricopa County in Arizona, where Phoenix is, or Clark County in Nevada, with Las Vegas. We were, on the fly, trying to partition the data based not on the presidential vote. We could see that, say, the round of absentee votes that came in at 10:05 p.m. EST also had a reported vote from a certain state legislative district. That gave us some information about what part of Maricopa County the votes were from.

When did you know the results of the 2020 presidential election?

By about 1:30 a.m., I knew Biden had won.

But it took days for the media, including CBS, to declare a winner.

Everyone on the Decision Desk team was just racking their brains through midnight — can we figure out anything about where these absentee ballots are coming from? CBS staffer Kabir Khanna and I finally hit on a model where we could understand not so much what data we already had, but what data we didn’t yet have — how Democratic or Republican were the areas that hadn’t reported their absentee counts. And given how things were trending, what that must mean for the outstanding ballots.

At 1:30 it became obvious to me, at least, that Biden would win Arizona, Georgia, Pennsylvania, Wisconsin, Michigan, and Nevada.
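The idea of modeling the data not yet reported can be sketched in a few lines. The example below is a hypothetical illustration, not the model Ansolabehere and Khanna built: it assumes each area comes with an estimate of its outstanding ballots and a baseline Democratic share from a prior election, then allocates the outstanding ballots by that share, shifted by an assumed uniform `swing`. The `Area` fields, the function name, and every number are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Area:
    name: str
    dem_counted: int            # Democratic votes reported so far
    rep_counted: int            # Republican votes reported so far
    outstanding: int            # assumed estimate of ballots not yet reported
    baseline_dem_share: float   # assumed two-party Democratic share from a prior election


def project_margin(areas: list[Area], swing: float = 0.0) -> float:
    """Projected Democratic-minus-Republican votes after allocating outstanding ballots."""
    margin = 0.0
    for area in areas:
        # Clamp the shifted share so it stays a valid proportion.
        share = min(max(area.baseline_dem_share + swing, 0.0), 1.0)
        dem_outstanding = area.outstanding * share
        rep_outstanding = area.outstanding - dem_outstanding
        margin += (area.dem_counted + dem_outstanding) - (area.rep_counted + rep_outstanding)
    return margin


areas = [
    Area("urban county", 100_000, 80_000, 50_000, 0.70),
    Area("rural county", 40_000, 90_000, 10_000, 0.30),
]

counted_margin = sum(a.dem_counted - a.rep_counted for a in areas)
print(counted_margin)                 # -30000
print(round(project_margin(areas)))   # -14000
```

In this made-up case the Democrat trails by 30,000 in the counted vote, but because most outstanding ballots sit in the heavily Democratic area, the projected deficit shrinks to about 14,000. Knowing where the missing ballots are can matter as much as knowing the current tally.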

What will you keep your eye on when you’re back at the Decision Desk?

One of the things we’re starting to look at right now is what’s the absentee ballot yield — that is, how many people have returned their absentee ballots — by party registration. A lot of states, not all, report party registration. But almost all of the states where the race will be close do.

And on Election Night itself, the first reporting we get is the 5 p.m. exit poll. If the Democrats aren’t up by five [points] in the raw exit-poll data, I’m going to guess it’ll be a bad night for them. That’s what happened in 2016. The exit-poll data came back with Hillary Clinton up by five.

Why would the exit polls favor Democrats?

Raw exit-poll data always overstate the Democrats by five. We’ve studied it. We found things to explain little parts of it, but nobody really knows why. It’s been that way since the 1970s.
