Friday, November 11, 2016

What Does Early Polling Tell Us?

In the wake of Hillary Clinton's loss to Donald Trump, it's worth considering whether Bernie Sanders would have suffered the same fate. During the primary, I pointed to historical polling of party nominees during the primary season, and how well it predicted final general election results:

https://twitter.com/n8r0n74/status/705825446785515520

At the time, two major sets of polling data were available to help assess the candidates: individual candidates' favorability ratings, and general election polling of several potential November matchups.

In the favorability ratings, Bernie Sanders far exceeded Clinton, who noticeably exceeded the ratings of Donald Trump (June 6 ratings in parentheses):

https://elections.huffingtonpost.com/pollster/bernie-sanders-favorable-rating (+9%)
http://elections.huffingtonpost.com/pollster/hillary-clinton-favorable-rating (-14%)
http://elections.huffingtonpost.com/pollster/donald-trump-favorable-rating (-25%)

In matchup polls, the final polling conducted between Sanders and Trump (at the end of Sanders' realistic chances to win the nomination) showed Sanders leading by an average of 10%, which is an enormous margin in modern electoral terms:

http://www.realclearpolitics.com/epolls/2016/president/us/general_election_trump_vs_sanders-5565.html

The RealClearPolitics running average for Clinton-Trump, measured at that same end date (June 6):

http://www.realclearpolitics.com/epolls/2016/president/us/general_election_trump_vs_clinton-5491.html

It's important to make an apples-to-apples comparison, using polls taken while the Sanders-Clinton race was still in progress. After it ended, Clinton predictably got a large, but short-lived, polling boost (as did Trump earlier in the season). Depending on how you average/smooth the data, the Clinton-Trump matchup polls gave Clinton between a +2% and +4% advantage at that time. So, conservatively, Sanders fared about 6 points better than Clinton in that future matchup (+10% versus, at best, +4%).

Are all these early data meaningful? I believe so. Here is an analysis of primary season polling data as a predictor of general elections:

http://fivethirtyeight.blogs.nytimes.com/2012/04/18/do-romneys-favorability-ratings-matter/?_r=1

I draw some different conclusions than Silver, but I'm starting with the data he provides.


These data show the averages of favorability ratings during the primaries and, of course, the final results. In 7 of these 10 previous contests, the winner was the candidate who polled more favorably during the primaries. 7 out of 10 seems modestly, but not wildly, skillful. But, I believe the results are more useful than that, when inspected closely.

One of the 3 "failures" was the 1988 Bush-Dukakis race. That was a failure. I will only note that, unlike other candidates, Dukakis had excellent net favorability while still having low favorable ratings. In other words, during the primary season, many respondents had either a neutral opinion of Dukakis or none at all. In statistical terms, we have to treat this kind of data as lower-confidence data. Dukakis and Bush had identical early favorable ratings of 34%. But, by November, Dukakis's unfavorable ratings had risen from 16% to 39%. In any case, this race was a failure in terms of prediction skill.

However, the 1992 Clinton-Bush failure was different. While Clinton's early ratings were worse than Bush's:

  • As with Dukakis, many voters had not yet formed either a favorable or unfavorable view
  • They were only modestly worse than Bush's (-11% vs -3%)
  • Most importantly, the final race was impacted by an independent candidate winning nearly 19% of the vote

This last point is crucial. A comparison of two early candidates cannot be expected to account for the effect of another candidate garnering nearly 19% of the final vote, unless that candidate siphons exactly as many Democratic votes as Republican ones. It seems entirely plausible to me that Ross Perot allowed Clinton to win the 1992 election. Thus, I think we need to throw that year out, entirely, in our analysis.

The last failure was the 2004 Bush-Kerry race. But, I submit that this wasn't a failure at all, rather an indication of the closeness of the race. Early polling showed Kerry at +1% and Bush at -1%, the smallest early season difference in the last 40 years. In the end, Bush won the popular vote by about 2.4%. So, while early polling did not successfully predict the winner, I think it's fair to say that it did predict the outcome, within a very modest range of uncertainty. In statistics, that needs to be the standard. Barely missing is not as bad as missing by a large margin. I argue that 2004 should be considered a success of this predictor, with the caveat that a couple percentage points of uncertainty has to be assumed. This, then, would mean that in 8 of the last 9 elections without large 3rd-party influence, primary season favorability ratings were a good predictor of outcome.

8 out of 9, or even 8 out of 10, has to be considered very good performance. This year, the gold standard of political poll analysis, FiveThirtyEight.com, had Hillary Clinton at >70% to win the election, on the very day of the election. They ended up missing on the popular vote by a modest 2%. No means of quantifying electoral chances is free of uncertainty in its results.
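As a rough sanity check on those hit rates, one can ask how often pure coin-flipping would do at least as well. The sketch below is only an illustration, not part of Silver's analysis: it assumes each election is an independent 50/50 guess, which is a simplification, and the helper name `p_at_least` is made up for this example.

```python
from math import comb

def p_at_least(hits: int, trials: int, p: float = 0.5) -> float:
    """Chance of at least `hits` correct calls in `trials` independent guesses."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(hits, trials + 1)
    )

# Hit rates discussed above: 7 of 10, 8 of 10, and 8 of 9.
for hits, trials in [(7, 10), (8, 10), (8, 9)]:
    print(f"{hits}/{trials}: {p_at_least(hits, trials):.3f}")
```

Under that coin-flip assumption, 7 of 10 would happen by luck about 17% of the time, while 8 of 10 or 8 of 9 would happen only about 5% or 2% of the time. With so few elections, this is suggestive rather than conclusive.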

That quantifiable predictors are not perfect is not a good reason to favor more qualitative analysis. It's only a good reason to be less certain about the predictions.

Questions


Q: But, if we're to believe that favorability rating is a good predictor, why didn't it predict Sanders beating Clinton?

A: I don't claim that favorability is a means by which to predict primary races. There are two unique factors affecting primaries. First, name recognition. Primary voters must choose between less well-known candidates. By November, both major party candidates are always well-known. In primaries, those with high familiarity (e.g., Clinton) have a huge advantage. This is why we often see party nominees who were "losers" in previous years' races. Second, at the national level, we have a close left-right split of the electorate. In that environment, I believe personal favorability has the opportunity to be a deciding factor. In primaries, there is no clear center-point of the party, and so no symmetry to be slightly tipped in one candidate's favor by good favorability ratings.

Q: What about 2000? Isn't that a "failure" because Gore actually won the popular vote?

A: That's an entirely fair reading. But, then, we could also consider 2016 a success because Clinton won the popular vote. In any case, I think 8 out of 10 successes is a reasonable interpretation of the history of this heuristic.

Q: But these polls were taken so far in advance. Couldn't things change by November?

A: Of course, they could. But, the final polls were conducted in June, only 5 months from the election. Clinton and Trump are also well-known quantities. Their favorability ratings were very consistent throughout the election season. Sanders is the least well-known of the three. His numbers may have changed. But, the point is that they would have to change by a tremendous amount in 5 months. His (favorability/matchup) numbers weren't just better than Clinton's in June. They were much better. And, Sanders' favorability ratings did not mirror Dukakis's in 1988, where large numbers of voters expressed no favorable/unfavorable view in the polls. 

In the end, the data show that, historically, things usually don't change that much from June to November.

Friday, November 4, 2016

Sanity Checking

One of the best things about posting ideas online is exposing them to a community to be challenged. You also record a snapshot of your thinking at a given time, which can later be revisited. This should serve as an opportunity to calibrate your judgment and evaluate your analyses. Here's an opportunity to do that with a claim I made last week:

https://twitter.com/n8r0n74/status/792168797322883073


This tweet was in reference to the James Comey (FBI) announcement on 10/28 about the discovery of new emails, possibly related to the previously-closed Clinton email investigation. My tweet claimed that the announcement would surely not change polls by 1%, persistently. To be clear, I was referring to the net margin between Clinton and Trump, which at the time stood at approximately 4% in Clinton's favor. Thus, I was claiming that the event would not cause Clinton's lead to drop to 3% or lower. It should also be clear from the "hiccups" clause that I was making no claim about short-term effects on polls, as voters scrambled to understand the meaning of the Comey release. Granted, I offered no specific clarification about what "persistent" meant, but in the context of an election that was only 11 days away, I intended the claim to pertain to the polls as they stood just before the election.

Results


As the election is still 4 days away, I actually don't think it's necessary to judge the prediction at all today. I consider less than a week still within "hiccup" territory, and many pollsters' latest results still include polling from the day of this news release, when voters may have been caught in the confusion of politicos spinning this story. Nonetheless, let's evaluate it today. I will be happy to admit failure and publicize it as such, should election day come and the prediction not have been validated.

Measuring


To assess my claim, we need a measure of the polling today, and a baseline before the Comey letter. I've consistently referred to rolling poll averages at RealClearPolitics.com, because their math is straightforward and the source data is well linked.

http://www.realclearpolitics.com/epolls/2016/president/us/general_election_trump_vs_clinton_vs_johnson_vs_stein-5952.html

I consider 4-candidate polls to be the proper measure, as in almost all 50 states, voters will actually have 4 (or more) choices before them. A 1% effect on polling can't be evaluated without accounting for 3rd-party candidates.

The results today (updated as new polls come in) show Clinton with a +2.4% margin. This is the easy part. The more difficult part is establishing a baseline. I anticipated I might need to back up this claim in the future, and actually tweeted about the (then) current polling immediately after the Comey letter:

https://twitter.com/n8r0n74/status/792556051904069633


I followed that up by noting that, at that time, the RCP moving average (Clinton +3.8%) included no polling data collected after the FBI letter was announced.


So, this is clear, right? Polls at +3.8% before the event, and +2.4% afterwards? I was wrong? Well, maybe. Again, I think the jury is still out (until Tuesday), but there are also a couple of points to consider:

What is our Baseline?


10/28 is clearly reflecting data before the event. But, what about the average on 10/29? Looking closely, we see a huge jump in Trump's numbers between 10/28 and 10/29:


Is that jump due to the Comey letter? Almost certainly not, for the most part. It's important to understand that RCP's moving average is not showing the sentiment of American voters on a given day. It's an average of the most recent polls that have been released before that day. Expanding the individual poll summaries below the chart, we see that RCP averages approximately 5 days' worth of polls, typically between 5 and 8 polls.


On 10/29, I believe their moving average was calculated from these 7 polls, judging by the average:


I say "I believe," because a simple average of these polls gives Clinton +2.7%. Hovering over the graph, however, shows +2.6%. So, I can't say for sure. But, it's possible that either:

  • RCP applies different weighting to polls based on sample size or margin-of-error. The two biggest polls in that group were both Clinton +1%, so that might explain a slightly lower result.
  • It may be that RCP is only displaying results to the nearest percent on their webpage, and has finer-grained data to calculate the overall results. So, 2.6 vs 2.7 may be due to rounding.

But, doesn't the 10/29 moving average include the Comey event? Barely at all. Only 1 of those 7 polls, the IBD/TIPP poll, continued into 10/28. That was a 6-day poll. If we assume its polling was spread evenly across those days, only 169 voters could possibly have known about the Comey letter. This is a generous estimate, given that many were likely at work on Friday, or otherwise didn't hear about the event until after participating in the poll. In any case, 169 voters represents only 2% of those polled in the 10/29 RCP moving average. So, why the big Trump jump from 10/28 to 10/29?

Almost certainly, this is a result of a block of very good poll results for Clinton from 10/24 now being 5 days old, and no longer in the average. Dropping results of +9, +9, +9, +14, +1, and +4 for Clinton was guaranteed to significantly reduce this moving average between 10/28 and 10/29. This aspect had nothing to do with the FBI letter.
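The aging-out effect can be sketched with a toy rolling average. This is a minimal illustration, not RCP's actual method: the 5-day window rule, the `window_average` helper, and the four later polls are all assumptions made up for the example. Only the six 10/24 margins come from the numbers quoted above.

```python
def window_average(polls, as_of_day, window_days=5):
    """Mean margin of polls that ended within the last `window_days` days."""
    recent = [
        margin
        for end_day, margin in polls
        if as_of_day - window_days < end_day <= as_of_day
    ]
    return sum(recent) / len(recent)

# Each entry: (day of October the poll ended, Clinton margin in points).
dropped_block = [(24, 9), (24, 9), (24, 9), (24, 14), (24, 1), (24, 4)]  # margins quoted above
later_polls = [(26, 2), (27, 1), (27, 3), (28, 2)]  # HYPOTHETICAL, for illustration only
polls = dropped_block + later_polls

# The same set of polls, averaged one day apart.
for day in (28, 29):
    print(f"10/{day}: Clinton {window_average(polls, day):+.1f}")
```

In this toy data no poll changes between the two days, yet the average falls by several points simply because the 10/24 block leaves the window. The toy figures are not meant to reproduce RCP's +3.8% and +2.6%.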

This underscores an important point: Clinton was already losing ground quickly, before the FBI letter was released. From the data, her biggest margin of +7.1% came on 10/17. By 10/28, it was down to +3.8% with zero influence from the FBI letter. By 10/29, with results including only 2% of respondents from Friday 10/28, the margin was down to +2.6%.



So, what is the best baseline? Well, there's probably no perfect answer. 10/28 has no post-FBI data in it. But, it does still carry the effect of 4 really good Clinton polls (+9,+9,+9,+14). 10/29 only has about 2% of its data coming from after the announcement.

The entire 10/28-10/29 gap in margin (+3.8% - +2.6%) can be explained by the removal of the 10/24 polls. For the limited amount of Friday the 28th polling to account for that difference instead, those 169 voters would have had to choose Trump by a ratio of roughly 4:1. That's nearly impossible. Even a 55/45 split can't be supported by the polls conducted since that date.
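The arithmetic behind that ratio can be sketched as a simple blend. This is a back-of-the-envelope illustration under stated assumptions: the other 98% of respondents are held at the +3.8% pre-letter average, the late 2% are treated as a two-way Clinton/Trump split (ignoring Johnson and Stein), and the function name `required_margin` is made up for the example.

```python
def required_margin(before, after, fraction):
    """Margin the late respondents need so the blended average equals `after`,
    assuming everyone else stays at `before`."""
    # after = (1 - fraction) * before + fraction * late_margin
    return (after - (1 - fraction) * before) / fraction

# Figures from the text: +3.8% on 10/28, +2.6% on 10/29, 2% post-letter respondents.
late_margin = required_margin(before=3.8, after=2.6, fraction=0.02)

# Two-way split assumption: clinton - trump = late_margin, clinton + trump = 100.
clinton_share = (100 + late_margin) / 2
trump_share = (100 - late_margin) / 2

print(f"Required margin among late respondents: {late_margin:+.1f}")
print(f"Implied Trump:Clinton ratio: {trump_share / clinton_share:.1f} to 1")
```

Under these assumptions the late respondents would need to favor Trump by about 56 points, a split of roughly 78/22, which is in the neighborhood of the 4:1 ratio above.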

So, I feel comfortable with the assessment that 10/29 is actually the best baseline. With that baseline, the change in polling is now down to only about 0.2% (from +2.6% to +2.4%). The selection of start date literally makes the difference between my prediction looking right (well under 1%), or wrong (assuming we evaluate it today at all).

Where Did the Votes Move From


The second interesting thing in the RCP moving average data is that Clinton's total did not drop at all, whether you use 10/28 or 10/29 as the start date. Jill Stein's numbers didn't change at all. It doesn't appear that the left, broadly, had any change in opinion based on this announcement.

The entirety of the difference in polling in the last week is Gary Johnson losing ground, and Trump gaining it (mostly the latter). Are we really to believe that bad news for Hillary Clinton caused Gary Johnson voters to shift their votes to Trump?

What seems more plausible to me is that Gary Johnson has been on a continual slide for nearly 2 months (as has Stein to a lesser degree). Johnson had a bad last week: his faux marijuana heart-attack and VP Weld's semi-endorsement of Clinton both likely lost him moderate Republican followers.

TL;DR


Was I right or wrong? Maybe. I still submit that it's too early to tell. I'll revisit this prediction on election day, when I'm hopefully not celebrating a Trump victory. And, hopefully improving on my early primary prediction of Marco Rubio being the GOP nominee. 😉