Monday, October 22, 2012

Dealing with Outlier Polls

What is an outlier poll? Any poll that shows a dramatically different result than the consensus of all the other polls. How to handle such polls has long been one of the most difficult questions for prognosticators to answer, and there is a wide range of approaches to the issue. Here are some of the most common ones:

1.) Do nothing, since it will probably all average out in the end. Under this approach the outlier is simply averaged with the other results. The key problem I have with this system is that there are not always outlier polls on both extremes, which means they don't just average out in the end.

2.) Take a predetermined number of points away from the poll. This begins to add a lot of potential bias to a formula. How do you determine how many points? Do you dock the same number of points every time? How is it fair to treat a reputable polling company with a long track record of reasonable results the same as a pollster with a long, shaky past of over-polling one party? Just because both of these polls have outlier numbers does not mean that they should be treated the same way.

3.) Discard outlier polls altogether. While this can be the most tempting solution, I find that it complicates matters all the more. Let's say, for instance, that a poll has had a long track record of biased or inaccurate results and is showing a result way different than the others. By removing this poll you are offsetting bias with more bias from the other extreme. How, then, is it fair to draw trend lines incorporating a poll that was thrown out of the average? If you throw out this poll in one state for having outlier numbers, how is it fair to keep it in another state for showing better numbers? The bottom line is that if a poll has no relevance in one state, it has no relevance in any other state.
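The objection to the first approach can be illustrated with a small sketch. The margins below are hypothetical numbers chosen only for illustration (positive means a lead for one candidate, negative a lead for the other); they are not taken from any real poll.

```python
from statistics import mean

# Hypothetical margins for illustration only (not real polls).
consensus = [0.0, 1.0, 2.0]            # polls clustered near the consensus
two_sided = consensus + [-8.0, 10.0]   # one outlier on each extreme
one_sided = consensus + [10.0]         # an outlier on one extreme only

print(mean(consensus))  # 1.0
print(mean(two_sided))  # 1.0  (the two outliers cancel each other)
print(mean(one_sided))  # 3.25 (nothing cancels the lone outlier)
```

When the outliers happen to fall on both sides, the plain average is unchanged, but a single one-sided outlier pulls it well away from the consensus.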

As always, consistency is the key. Whatever method is used has to stay uniform regardless of whether it shows good news or bad news for your candidate. Our system uses three different classifications of polling companies, which were established before the campaign started and have never changed.

1. The first group consists of polls that are reputable and have a decent track record of not showing overly biased numbers and of being reasonably accurate at the end. If one of these polls is within 4 points of the average, it is weighed as it ordinarily would be and is not considered an outlier. If, however, it is more than four points from the average after its result is included, then you subtract this new average of all the polls from this one poll and divide that number by two. This is the amount that is then docked from the poll's result.

2. The second classification covers pollsters that have a sketchier track record and have also shown a consistent bias in past elections, or that are consistently showing outlier numbers in all states in favor of one candidate in this election cycle. If one of these polls is within 3 points of the average, it is weighed as it ordinarily would be and is not considered an outlier. But if its result is more than three points from the average, it is temporarily put aside while you calculate the average of the polls from classification one (described above). You then once again subtract this adjusted average from the outlier poll and divide by two. This is the amount that is docked from the poll's result. The adjusted result is therefore the midpoint between the original poll result and the adjusted average of reputable firms.

3. The third classification of polls is very rare. These pollsters not only have a bad track record of past accuracy but, to complicate matters, can sometimes show outlier numbers for both candidates depending upon the state. If a pollster is new, has bad reviews from both sides, and has shown outlier numbers on both sides, then a different method is used. Its result is always put aside at first while all the other polls are averaged, and regardless of how close it is to that average, the adjusted average is subtracted from its result and the difference is divided by two. The only pollster that meets these criteria in this election cycle is Gravis Marketing.
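The three classifications above could be sketched roughly as follows. This is only one reading of the rules, not a definitive implementation: the names `Poll`, `tier`, and `adjusted_margin` are made up for illustration, the sign convention (positive for an Obama lead, negative for a Romney lead) is an assumption, and the sketch assumes that the 3-point test for the second classification is measured against the average of all polls.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Poll:
    firm: str
    margin: float  # assumed convention: positive = Obama lead, negative = Romney lead
    tier: int      # 1 = reputable, 2 = sketchy track record, 3 = rare third group


def adjusted_margin(poll, polls):
    """Return the poll's margin after any outlier docking (hypothetical helper)."""
    all_margins = [p.margin for p in polls]
    others = [p for p in polls if p is not poll]

    if poll.tier == 1:
        # Compare against the average that includes this poll's own result.
        reference = mean(all_margins)
        if abs(poll.margin - reference) <= 4:
            return poll.margin
    elif poll.tier == 2:
        # Assumption: the 3-point test uses the average of all polls.
        if abs(poll.margin - mean(all_margins)) <= 3:
            return poll.margin
        reputable = [p.margin for p in others if p.tier == 1]
        if not reputable:
            return poll.margin  # nothing to compare against; leave unchanged
        reference = mean(reputable)
    else:
        # Third group: always adjusted against the average of the other polls.
        if not others:
            return poll.margin
        reference = mean(p.margin for p in others)

    dock = (poll.margin - reference) / 2  # half the gap to the reference average
    return poll.margin - dock


polls = [
    Poll("A", 1.0, 1), Poll("B", -2.0, 1), Poll("C", 4.0, 1),
    Poll("D", 1.0, 1), Poll("E", 8.0, 2),
]
print(adjusted_margin(polls[4], polls))  # 4.5
```

In every adjusted case the poll ends up halfway between its original margin and the reference average, so the size of the dock changes whenever the reference average changes.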

Now let's give an example of the end results of this system in a toss-up state like Iowa:
Rasmussen- TIED
PPP- Romney by 1
NBC- Obama by 8
We Ask America- Obama by 3
Original Average- Obama by 2
In this case the NBC poll is clearly an outlier and will be pushed aside for a moment while we adjust the numbers and average them together.
Rasmussen- Obama by 1
PPP- Romney by 2
We Ask America- Obama by 4
ARG- Obama by 1
New Average- Obama by 1
Now we subtract this average of Obama by 1 from the NBC result of Obama by 8. This makes 7, which is divided in half to make 3 1/2. The original result of Obama by 8 becomes Obama by 4 1/2.
The final polling average with this adjusted poll included becomes Obama by 1.7. What this means is that the adjustment given to the NBC poll changes every time a poll from a reputable firm comes out. So if the adjusted average moves more towards Romney, then this NBC poll moves more towards Romney, and vice versa. With the slight trending towards Romney, the 1.7 margin gets reduced by .50, making the new final projection Obama by 1.2.
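
The Iowa arithmetic can be checked with a few lines of Python. The variable names are made up for illustration, and the sign convention (positive for an Obama lead, negative for a Romney lead) is an assumption; the numbers themselves come from the adjusted list above.

```python
from statistics import mean

# Adjusted Iowa margins from the list above; positive = Obama lead (assumed convention).
reputable = {"Rasmussen": 1.0, "PPP": -2.0, "We Ask America": 4.0, "ARG": 1.0}
nbc = 8.0

adjusted_average = mean(reputable.values())    # Obama by 1
dock = (nbc - adjusted_average) / 2            # 3.5 points docked
nbc_adjusted = nbc - dock                      # Obama by 4.5
final_average = mean([*reputable.values(), nbc_adjusted])  # Obama by 1.7

trend_adjustment = 0.5                         # the stated shift toward Romney
projection = final_average - trend_adjustment  # Obama by 1.2

print(f"NBC adjusted: {nbc_adjusted:.1f}")
print(f"Final average: {final_average:.1f}")
print(f"Projection: {projection:.1f}")
```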

A similar thing happened today with the CBS/Quinnipiac poll showing Obama by 5. Because this poll is an outlier, the average margin of victory only moved .3 in Obama's favor. Now, with the Suffolk poll showing a tied race, look for even tighter margins by later today.
