Historical data on how the reliability of polling data depends on time remaining until a US presidential election

Question

Just for entertainment, I wrote a program that simulates the US presidential election of 2020. (I was interested in seeing whether, for example, PredictIt's relatively low probability of Biden's winning the election could be reconciled with its relatively high probability of his winning Pennsylvania.) The program assumes that between now and election day, the vote in each state will change by some amount. This change is modeled as the sum of two normal (Gaussian) random variables, one of which is nationwide and the other per-state. This question is about how big to make the first variable's standard deviation, which I denote by A.
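To make the model concrete, here is a minimal sketch of the kind of simulation I mean. It is not my actual program: the function names, the per-state standard deviation `b`, and the three made-up states in the usage note are all hypothetical, and it assumes each state is summarized by a single polling margin in percentage points.

```python
import random

def simulate_once(poll_margin, electoral_votes, a, b, rng):
    """Simulate one election; return the electoral votes won by candidate 1.

    poll_margin: state -> current polling margin for candidate 1 (points)
    a: standard deviation of the nationwide shift (the "A" of the question)
    b: standard deviation of the independent per-state shift (hypothetical name)
    """
    national_shift = rng.gauss(0.0, a)  # one draw shared by every state
    total = 0
    for state, margin in poll_margin.items():
        state_shift = rng.gauss(0.0, b)  # a fresh draw for each state
        if margin + national_shift + state_shift > 0:
            total += electoral_votes[state]
    return total

def win_probability(poll_margin, electoral_votes, a, b, threshold,
                    trials=10000, seed=0):
    """Fraction of simulated elections in which candidate 1 reaches threshold."""
    rng = random.Random(seed)
    wins = sum(
        simulate_once(poll_margin, electoral_votes, a, b, rng) >= threshold
        for _ in range(trials)
    )
    return wins / trials
```

For example, with invented inputs `{"X": 5.0, "Y": -3.0, "Z": 1.0}` for the margins and `{"X": 10, "Y": 20, "Z": 5}` for the electoral votes, `win_probability(margins, votes, a=5.0, b=3.0, threshold=18)` estimates the chance of winning a majority of the 35 votes. Because the nationwide shift is shared, a larger `a` makes the state outcomes more strongly correlated.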
I've found data showing that historically, state polling from a week or two before election day has a mean absolute error of about 5% in predicting the result in that state, which is quite small. However, it's only July now, and traditionally most voters would not have paid much attention at all to a presidential election this early in an election year. Therefore you would expect the correct value of A to be much higher. That is, there is plenty of time for events to occur and for people to make up their minds or change their minds.
Can anyone point me to any historical data on how big these fluctuations are likely to be when x amount of time remains until election day? (The 2020 election may be different, because Trump is so polarizing and so many people have already made up their minds about him, but that would be a different topic. Right now I'm just looking for historical data.)

user5526 · Answer

I did some more web searching, and I found some information that seems to answer my own question.
This site has polling errors for the popular vote in the last 10 presidential elections. Twelve months out, the average absolute error is about 12%; four months out it is about 9%; and in the final month it is about 2.5%.
If you restrict attention only to more recent elections (1996-2016), then you get a very different picture. The average absolute error is somewhere around 5%, and it doesn't go down much except when you get very close to the election. They don't comment on why this is. It could be that partisanship has gotten stronger, or that pollsters have gotten better at their jobs, or it could just be a coincidence that the last 6 elections behaved this way. When the error is as small as 5%, you're mainly measuring polling error, not true shifts in voters' opinions.
I'm actually interested in state voting, not the popular vote, so this is not exactly what I needed. However, it seems reasonable to roughly double these numbers for state polling, since, as described in the link in the question, state polling errors in the final month are typically about 5% (smaller for states that have been well polled).
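Since the question asks for a standard deviation and these figures are mean absolute errors, a conversion is needed. The sketch below assumes the errors are roughly zero-mean and normally distributed, in which case the standard deviation equals the mean absolute error times sqrt(pi/2), about 1.25. The function name is my own, the dictionary just restates the national figures quoted above, and the doubling for states is the rough rule of thumb from the previous paragraph, not a measured quantity.

```python
import math

def mae_to_sigma(mae):
    """Standard deviation of a zero-mean normal variable whose
    mean absolute value is `mae` (assumes normality)."""
    return mae * math.sqrt(math.pi / 2.0)

# National popular-vote mean absolute errors quoted above, in percent,
# keyed by time remaining before the election.
national_mae = {"12 months": 12.0, "4 months": 9.0, "final month": 2.5}

for horizon, mae in national_mae.items():
    national_sigma = mae_to_sigma(mae)
    state_sigma = mae_to_sigma(2.0 * mae)  # rough doubling for state polls
    print(f"{horizon}: national sigma ~ {national_sigma:.1f}, "
          f"state sigma ~ {state_sigma:.1f}")
```

These numbers lump together polling error and real shifts in opinion, so they are at best an upper bound on the shift alone, and the source does not settle how to split the state-level figure between the nationwide and per-state components.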
