Analyzing Exit Poll Results

It’s important to state up front what data will be collected, what analysis to perform on that data and what constitutes evidence of a serious problem versus random errors in any poll of this nature.

Polling stations allow voters access to the results after the polls are closed when the votes have been tallied.  In Sedgwick County, they will have separate reports printed for the electronic voting machines and the scanned paper ballots.

We will also get a count of the number of provisional ballots collected from the polling location.  These ballots will not be opened until the voter’s registration is verified, and there will never be an official tally of the provisional votes for the polling location.  But we can look at the exit poll results we have for voters who submitted provisional ballots and compare them with the votes that were counted at the polling location.  If there are significant differences, this is evidence of the voter suppression effect of Kris Kobach’s voter registration rules.

I have created a general data collection and analysis EXCEL spreadsheet.  Multiple precincts vote at each polling location and the results are reported for each precinct, not the polling location, so I’ve set up a spreadsheet to sum the numbers up and compute the appropriate probabilities.

I will be customizing this Excel file for each exit poll location in Kansas, but I am happy to share a general version of this worksheet with anyone who is interested in running an exit poll for their own area.  All you will have to do is input the official results and your exit poll results.  This is an example of the output.

 

Example Data Analysis
Presidential race Chi-Squared Result:  NA

Candidates     Exit Poll   Official Results   Binomial Probability
Clinton (D)    52          60                 0.0638
Trump (R)      38          30                 0.0530
Johnson (L)    6           6                  0.5593
Stein (G)      2           2                  0.5967
Other          2           2                  0.5967
Total          100         100

There are two different analyses that can be used in this situation.  The chi-square test gives a single probability for the whole race: how likely differences at least this large would be under the assumption of random chance.  Excel has this test as a built-in function: CHISQ.TEST.  But the chi-squared test has minimum data requirements (a common rule of thumb is an expected count of at least 5 in every category), which were not met in this example, hence “NA” or Not Applicable as the result of this test.
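As a rough illustration of the chi-square check outside of Excel, here is a minimal Python sketch. It is not the spreadsheet itself. The function names, the cutoff of 5 expected votes per category, and the idea of pooling the three smallest categories into one are my own assumptions for the example, and the counts are the contrived ones from the table above.

```python
import math

def chi_square_sf(x, df):
    """Upper-tail probability of a chi-square distribution with df degrees
    of freedom, using the series for the regularized lower incomplete gamma
    function. Adequate for the moderate statistics seen here."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = total = 1.0
    n = 1
    while term > 1e-15 * total:
        term *= z / (a + n)
        total += term
        n += 1
    lower = math.exp(a * math.log(z) - z - math.lgamma(a + 1)) * total
    return max(0.0, 1.0 - lower)

def chi_square_test(exit_counts, official_counts, min_expected=5):
    """Return the chi-square probability, or None (the spreadsheet's "NA")
    when any expected count falls below min_expected (assumed cutoff)."""
    n_exit = sum(exit_counts)
    n_official = sum(official_counts)
    # Expected exit poll counts if the poll mirrored the official shares.
    expected = [n_exit * c / n_official for c in official_counts]
    if min(expected) < min_expected:
        return None
    stat = sum((o - e) ** 2 / e for o, e in zip(exit_counts, expected))
    return chi_square_sf(stat, len(exit_counts) - 1)

# The example table: Stein and Other have expected counts of 2, so "NA".
print(chi_square_test([52, 38, 6, 2, 2], [60, 30, 6, 2, 2]))

# Hypothetical variant: pool Johnson, Stein and Other into one category.
print(chi_square_test([52, 38, 10], [60, 30, 10]))
```

With the small categories pooled, the statistic is 3.2 on 2 degrees of freedom, a probability of about 0.20, which would fall in the "typical random variation" range.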

Since the chi-squared test will not work for every set of possible data, I also show the individual binomial probabilities for each of the candidates.  The minimum probability from this set of five computations is a reasonable approximation to the exact computation using the extension of the binomial distribution (the multinomial distribution), and it can be easily computed using the built-in Excel formula BINOM.DIST.
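For anyone who wants to check the binomial column without Excel, here is a small Python sketch. The function names are hypothetical, and the rule for choosing the tail is my reading of the example table rather than something stated in the spreadsheet: when the exit poll count is at or above what the official share predicts, it sums the upper tail, otherwise the lower tail. That reading appears to match the table (for instance, 0.5967 for Stein).

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def tail_probability(exit_count, n_exit, official_count, n_official):
    """One-sided probability of an exit poll count at least this far from
    the official share, assuming the official count is correct."""
    p = official_count / n_official
    # Compare with integers to avoid floating-point ties at equality.
    if exit_count * n_official >= official_count * n_exit:
        ks = range(exit_count, n_exit + 1)   # upper tail: this many or more
    else:
        ks = range(0, exit_count + 1)        # lower tail: this many or fewer
    return sum(binom_pmf(k, n_exit, p) for k in ks)

# Contrived counts from the example table: (exit poll, official results).
example = {
    "Clinton (D)": (52, 60),
    "Trump (R)": (38, 30),
    "Johnson (L)": (6, 6),
    "Stein (G)": (2, 2),
    "Other": (2, 2),
}
probs = {name: tail_probability(e, 100, o, 100) for name, (e, o) in example.items()}
for name, prob in probs.items():
    print(f"{name:12s} {prob:.4f}")
print("Minimum probability:", round(min(probs.values()), 4))
```

The minimum across the candidates is the number to compare against the 0.05 and 0.001 thresholds.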

How to interpret this:

We judge the probability of machine manipulation of the vote by evaluating the probability of our results assuming no manipulation of votes is occurring.  This is referred to as the “null hypothesis”.  All probabilities shown are made under this assumption.  If this probability is above 0.05 (5%), we can reasonably conclude that the differences between the machine vote share and the exit poll vote share are typical of random variation due to the normal errors in the process.

If this value lies between 0.05 and 0.001, raise an eyebrow and give the numbers for that race a little extra scrutiny and consider it in concert with the other exit polling results.

If this value lies below 0.001, that is evidence of fraud.  Personally, I would like to see a recount of any race with results that fall this far from normal.  But only a candidate can request a recount in Kansas.

In this example, I have contrived to show Trump with a questionably low number of votes in the official count compared to the exit poll results.  Clinton has a slightly elevated value.  But these results are not unexpected, as the minimum probability of results this far off is above 0.05.

But if the other sites have similar values and they are all benefitting the same candidate, it would be concerning.  If 2 or 3 sites out of 5 show the same beneficiary of the differences, that’s reasonable.  But if 5 out of 5 sites show the same beneficiary, it’s evidence of rigging.

If we see multiple races with low odds and the same slate of candidates is benefiting, we have solid evidence of machine manipulation of our official votes.  If we see only the normal expected errors, then we have solid evidence it is NOT being manipulated.

While a single location and a single race might show evidence of manipulation, savvy cheaters will try to avoid this method of detection by keeping the shift small enough that the probability stays above the 0.05 threshold.  But by looking at multiple races and sites, we can establish whether even small shifts show evidence of cheating.

We can define a slate of candidates by party and check the probability of getting the results we got using a similar binomial analysis.  Under the null hypothesis of no manipulation, the probability of an error that benefits a given candidate is 50%.  There are three races with candidates and five judges we are asking about, for a total of 8 results for each polling location.  Governor Brownback would like to see 4 of the 5 judges lose their jobs so that he can replace them.  We can also presume he supports the Republican Party candidates for President, Senate and Representative.

We will have data from 5 different locations for a total of 40 random samples, each with approximately 50% probability.  For example, let X be the number of errors that went in the opposite direction from the Brownback administration’s preferred candidates.  If we have 40 random samples as defined above, the probability of getting X or fewer errors in the opposite direction of his preferred result is computed with the following Excel formula:  BINOMDIST(X, 40, .5, 1)

If this value is extremely low (less than .001), we conclude that the Republican Party has unduly benefited and further investigation would be appropriate.
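The slate check can also be sketched in Python. This is only an illustration of the cumulative binomial computation that BINOMDIST(X, 40, .5, 1) performs. The function name and the sample values of X are made up for the example, not results from any poll.

```python
from math import comb

def prob_at_most(x, n=40, p=0.5):
    """Cumulative binomial probability of x or fewer successes in n trials,
    the same quantity as Excel's BINOMDIST(x, n, p, 1)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x + 1))

# Hypothetical values of X: the number of the 40 errors that went AGAINST
# the slate's preferred results. About 20 is expected with no manipulation.
for x in (20, 15, 10, 5):
    prob = prob_at_most(x)
    flag = "investigate further" if prob < 0.001 else "no strong signal"
    print(f"X = {x:2d}  probability = {prob:.6f}  ({flag})")
```

The smaller X is, the more lopsided the errors are in the slate's favor, and the lower the probability that this happened by chance.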

How to interpret the Provisional Ballot Data:

We cannot know the final count of the provisional ballots collected at a polling location.  They are pooled at the county level, and only those that are shown to be from registered voters are opened and counted.  What we can do is compare the results of the provisional ballots with the other responses to our exit poll.  If there is a major difference between the votes of those asked to fill out provisional ballots and the automatically counted votes, we have a measure of the effect of the voter ID laws and whether it made a difference to the outcome.

For each race, we can use the chi-square test if we have sufficient data.   Otherwise, we can use the binomial approximation similar to the one used to compare the official count to the exit poll survey results.

electioneering and instructs them regarding what they can and cannot do.  While permission is not required to run an exit poll, we do need permission from the property owner to set up a booth to collect our ballots and provide chairs and shade for our volunteers.  Mainly, we want everyone to know what we are doing to avoid any issues arising on election day.

How to Run an Exit Poll Part 1

How to Run an Exit Poll Part 2

Creating an Exit Poll Ballot

This is part 3 of my “How to Run an Exit Poll” series.

The exit poll survey ballot is important, but not complicated.  The only question of interest, other than their ballot choices, is the method of voting.  Data will be available at the end of the day with separate totals for the machine-cast votes and the scanned paper votes.  There will be no official count of provisional votes at this station, so we can only compare those votes to the overall total for the polling station.  But that comparison allows us to evaluate whether giving people provisional ballots amounts to a voter suppression tactic.

Since space on our survey form is at a premium, and because including that information makes their response less anonymous, I do not recommend including questions about age, race or gender.  Generally speaking, you want to keep the words to a minimum.  (Not an easy task for me.)

Here is an example survey I have developed for exit polls in Sedgwick Co.  I included a short paragraph at the top because I feel it’s important to let people know why you want this information and reassure them that results are anonymous, just like their vote.

Sample Exit Poll Survey

The first question is really too long, but I wanted to be as clear as I can about this question.  In Sedgwick County, Kansas, there are three possible options:  a vote cast via electronic voting machine, a paper ballot that the voter feeds into a scanner for on-site electronic counting, or a provisional ballot, which is a paper ballot that is sealed into an envelope to be counted later (maybe).

Asking about the specific races is straightforward.  State the office and then list the candidates.  Circling answers reduces the need for a blank or box to check.  It saves space on the page.

Staggering the answers for questions with more than one line of answers (e.g., President) makes it easier to discern the voter’s intent.  When they are stacked one above another, the answer may easily become ambiguous.

Since a single polling location will have multiple precincts voting there, it’s a problem asking about races where different precincts will be voting  for different candidates.   Generally, I want to confine the questions to races that will appear on every ballot at the polling location.   On the other hand, my site managers for the SW Wichita location are very interested in the county commissioner races.  We arrived at the following:

Who did you vote for your County Commissioner Race? (Select one for District 2 OR  District 3)   –  sw-wichita-nov-8-exit-poll-ballot

I have hopes that we won’t get too many voters identifying their choices for both district 2 and district 3, but I expect we will get some.   OTOH, it’s the only question that would be spoiled and I’m reasonably comfortable in assuming that such mishaps are equally likely to occur regardless of which candidate they support.  I think we will get good data from this exit poll.
