After elections, people often note unexpected outcomes and then complain that “the polls got it wrong.”
After Donald Trump’s stunning 2016 presidential victory, the press gave us articles on “Why the Polls were such a Disaster,” on “4 Possible Reasons the Polls Got It So Wrong,” and on “Why the Polls Missed Their Mark.” Stupid pollsters. “Even a big poll only surveys 1500 people or so out of almost 130 million voters,” we may think, “so no wonder they can’t get it right.
Moreover, consider the many pundits who, believing the polls, confidently predicted a Clinton victory. They were utterly wrong, leaving many folks shocked on election night (some elated, others depressed, with later flashbulb memories of when they realized Trump was winning).
So how could the polls, the pundits, and the prediction models have all been so wrong?
Or were they? First, we know that in a closely contested race, a representative sample of a mere 1500 people from a 130 million population will—surprisingly to many people—allow us to estimate the population preference within ~3 percent.
Sounds easy. But there’s a challenge: Most randomly contacted voters don’t respond when called. The New York Times “Upshot” recently let us view its polling in real time. This enabled us to see, for example, that it took 14,636 calls to Iowa’s fourth congressional district to produce 423 responses, among which Steve King led J. D. Scholten by 5 percent—slightly more than the 3.4 percent by which King won.
Pollsters know the likely demographic make-up of the electorate, and so can weight results from respondents of differing age, race, and gender to approximate the population. And that, despite the low response rate, allows them to do remarkably well—especially when we bear in mind that their final polls are taken ahead of the election (and cannot account for last-minute events, which may sway undecided voters). In 2016, the final polling average favored Hillary Clinton by 3.9 percent, with a 3 percent margin of error. On Election Day, she won the popular vote by 2.1 percent (and 2.9 million votes)—well within that margin of error.
To forecast a race, fivethirtyeight.com’s prediction model does more. It “takes lots of polls, performs various types of adjustments to them [based on sample size, recency, and pollster credibility], and then blends them with other kinds of empirically useful indicators” such as past results, expert assessments, and fundraising. Here is their 2016 final estimation:
Ha! This prediction, like other 2016 prediction models, failed.
Or did it? Consider a parallel. Imagine that as a basketball free-throw shooter steps to the line, I tell you that the shooter has a 71 percent free-throw average. If the shooter misses, would you disbelieve the projection? No, because, if what I’ve told you is an accurate projection, you should expect to see a miss 29 percent of the time. If the player virtually never missed, then you’d rightly doubt my data.
Likewise, if, when Nate Silver’s fivethirtyeight.com gives a candidate a 7 in 10 chance of winning and that candidate always wins, then the model is, indeed, badly flawed. Yes?
In the 2018 U.S. Congressional races, fivethirtyeight.com correctly predicted 96 percent of the outcomes. On the surface, that may look like a better result, but it’s mainly because most races were in solid Blue or Red districts and not seriously contested.
Ergo, don’t be too quick to demean the quality polls and the prediction models they inform. Survey science still works.