Obviously with any poll there is a margin of error, and the result was within the error margin of the polls. The bookies and the markets, however, were dead wrong! Before the vote, a bearish bet on volatility in VIX options paid about $20 to risk $80. The same bet on the bullish side paid about $80 to risk $20!!! I’m sure there is always a little skew baked into VIX options due to their mean-reverting nature, but that is insane… and it was completely wrong.
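Rough math on those odds: risking $80 to win $20 only breaks even if the outcome happens about 80% of the time, so the market was pricing roughly 80/20 in favor of calm. A quick sketch (the dollar figures are the rough ones quoted above, not actual market prices):

```python
# Implied probability from a binary-style payoff: risk / (risk + payout).
# Ignores fees and the structural VIX skew mentioned above.

def implied_probability(risk: float, payout: float) -> float:
    """Fraction of the total pot you put up = break-even probability."""
    return risk / (risk + payout)

# Bearish-volatility bet (market effectively betting on Remain / calm):
print(implied_probability(risk=80, payout=20))  # 0.80 -> ~80% implied
# Bullish-volatility bet (betting on a Leave shock):
print(implied_probability(risk=20, payout=80))  # 0.20 -> ~20% implied
```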
^^That is a misinterpretation on your part. The polls did not indicate Remain was in the lead; the margin of error was too large for that. The polls correctly indicated it was too close to call.
If it was a statistically random sample, 1500 - 2000 people should be enough to get +/- 2% margins of error on an up/down vote. This is a fairly standard count for polling, even in the US, where there are 300 million people (of which maybe 100 million vote in a big election).
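A quick sanity check on those numbers, assuming the usual two-standard-error margin and a worst-case 50/50 split:

```python
import math

# Margin of error ~= 2 * sqrt(p*(1-p)/n), worst case at p = 0.5.
for n in (1500, 2000):
    moe = 2 * math.sqrt(0.5 * 0.5 / n)
    print(f"n={n}: +/- {moe:.1%}")
# n=1500: +/- 2.6%
# n=2000: +/- 2.2%
```

So 1500-2000 truly random respondents gets you in the neighborhood of +/- 2%, regardless of whether the population is 50 million or 300 million.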
The challenge is generating a truly random sample, asking questions in a way that is not leading, and doing it in an environment where you are likely to get honest answers. If phones and the internet are more likely to reach one side of an issue than the other (the more globalized are more likely to be connected, for example), then you have a sufficient sample size but it still has selection bias.
The other challenge is to ascertain the likelihood of voting. If the Yes people are much more fired up than the No people, then it can look like one thing in the polls and come out the other way in the election, which is why get-out-the-vote campaigns are so strongly funded by candidates in tight races.
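As a toy illustration (numbers entirely made up), differential turnout can flip a polling lead:

```python
# Hypothetical illustration: a poll showing Remain ahead flips once you
# weight respondents by how likely they are to actually show up and vote.
poll = {"remain": 0.52, "leave": 0.48}       # raw poll shares (made up)
turnout = {"remain": 0.65, "leave": 0.75}    # assumed turnout rates (made up)

votes = {side: poll[side] * turnout[side] for side in poll}
total = sum(votes.values())
for side, v in votes.items():
    print(f"{side}: {v / total:.1%}")
# remain: 48.4%, leave: 51.6% -> the poll "lead" reverses at the ballot box
```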
So bchad… do you mean, for example, that if the poll indicated 48% say Leave, the margin of error would be 48% +/- 2 percentage points, OR 48% +/- ~1 point (since 2% of .48 is about .01)?
The first condition would indicate that the polls were inconclusive, since the margin of error for Remain would overlap with the margin of error for Leave (given that a 3-point lead means a 3% gap between Remain and Leave). The second condition would indicate that Remain was in the lead with statistical significance. …Or I am oversimplifying something that would require a more in-depth hypothesis test to sort out.
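To put numbers on the two readings I mean (just sketching the question, not answering it):

```python
# Two readings of "48% +/- 2%" on a Leave share of 48%:
point_reading = (0.48 - 0.02, 0.48 + 0.02)       # 46% to 50% (percentage points)
relative_reading = (0.48 * 0.98, 0.48 * 1.02)    # ~47% to ~49% (2% of 48%)

# Under the first reading the interval reaches 50%, so the poll can't
# rule out a Leave win; under the second it tops out below 50%.
print(point_reading)     # (0.46, 0.50)
print(relative_reading)  # (0.4704, 0.4896)
```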
That’s what I’m saying, I don’t think you can get a truly random sample that is representative of the overall population based on the methods that these surveys typically employ.
They actually did find a disparity between the telephone polls and the online polls.
From what I’ve seen in polling, the margin of error usually means that there is an 80% chance that the actual result will fall within that range. For example, if 45% is the predicted result for a certain vote and the margin of error is +/- 3%, that means there is an 80% chance of the vote falling between 42% and 48%. Polls are never “wrong” just because the result was different from the most probable outcome. The handicapping could be off, but that is very hard to demonstrate. People’s understanding of polling and statistics is the weak link. Time well spent by most would be learning about Monte Carlo simulations.
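For example, a minimal Monte Carlo sketch (assuming a hypothetical true Leave share of 52% and a perfectly random 1,500-person poll):

```python
import random

# How often does a 1500-person poll of a population that is truly
# 52% Leave still come back showing Remain ahead?
TRUE_LEAVE, N_RESPONDENTS, N_SIMS = 0.52, 1500, 2_000

remain_ahead = 0
for _ in range(N_SIMS):
    leave_count = sum(random.random() < TRUE_LEAVE for _ in range(N_RESPONDENTS))
    if leave_count / N_RESPONDENTS < 0.5:
        remain_ahead += 1

print(f"Polls wrongly showing Remain ahead: {remain_ahead / N_SIMS:.1%}")
# Roughly 6% of perfectly random polls would still call it the other way.
```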
No way. Which fund is going to hire 500 people to troll outside all voting booths? In fact, most of these places just straight up take research from JPM or Goldman. That’s why all the hedge funds pile into the same strategies and lose money at the same time.
I’d guess that most funds lost a bunch of money on this Brexit surprise. Look how stocks were up 2% to 3% right before the vote. Investors on the whole were obviously long into the event, and they lost it as the market sold off afterwards.
In the last week there was a move towards Remain, but it was hardly definitive, and several polls were showing Leave ahead. Also, after the MP Jo Cox was murdered, there was probably a small percentage of Leave voters who became more shy about expressing that publicly. When it came down to actually voting, I don’t suppose too many people let the actions of a lone crazy guy sway their choice.
That is purely speculative on my part, but I think the trend to Remain over the last week may not have been real but rather a sympathetic reaction to the murder.
In any case, the polls in the last two weeks were very close with many polls showing each side in the lead.
The betting odds were way off, of course, which is more interesting.
OK, yes, you’re right. But the problem is the sampling method was biased and not (as you originally suggested) that there was an N of only 1500 for a population of 50 million.
You could have sampled 5 million people, and your result would still have been just as unreliable. On the other hand, if the sampling method were truly randomized (say, get a list of people eligible to vote, choose 1,500 at random, and go chase them down until every last one gave an answer), you could get a better result than if you just increased N to a larger number.
It’s been a while since I’ve done any polling type work, but I believe what newspapers report as the margin of error is the size of two standard errors.
So 48% +/- 2% would mean 95% confidence of a result between 46% and 50%, assuming an unbiased sample. You’re right that this means you probably can’t rule out 50% + 1, but you probably can rule out 51% (at 95% confidence). Usually with a result like this, one would report the 95% confidence interval, note that it can’t rule out the result going the other way, and then state the confidence level at which you could rule out 50% (which actually would be 95% in the example here). To get a more conclusive view, you would have to sample a larger N, but you don’t necessarily know what the mean is going to be before you collect the sample, which is why you sometimes have no choice but to conclude that your results are inconclusive at a standard confidence level, but still suggest the vote is more likely to go one way than the other (in this case, that something like 99.9% of the 95% confidence interval is on one side of the 50/50 line).
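To make that concrete, a quick sketch under the normal approximation (numbers illustrative, matching the 48% / +/- 2% example):

```python
import math
from statistics import NormalDist

p_hat, n = 0.48, 2500          # sample proportion and sample size (illustrative)
se = math.sqrt(p_hat * (1 - p_hat) / n)

lo, hi = p_hat - 1.96 * se, p_hat + 1.96 * se
print(f"95% CI: {lo:.1%} to {hi:.1%}")        # ~46.0% to ~50.0%

# Probability mass below the 50/50 line under the normal approximation:
print(f"P(true share < 50%): {NormalDist(p_hat, se).cdf(0.50):.1%}")  # ~97.7%
```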
If I remember correctly, the standard error of a proportion is Sqrt(p*(1-p)/N), or (N-1) if it’s from a sample, which it usually is. The closer you are to a 50/50 split, the higher the N you need to resolve. Solving backward, it suggests that a 2% margin of error on a 50/50 split would require a sample size of 2500 valid answers.
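The back-solve, spelled out:

```python
import math

# Margin of error (two standard errors) on a 50/50 split:
#   MOE = 2 * sqrt(p*(1-p)/N)  =>  N = 4 * p*(1-p) / MOE**2
p, moe = 0.50, 0.02
n = 4 * p * (1 - p) / moe**2
print(n)  # 2500.0 valid answers, as stated above
```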
The interesting thing here is that the true population size doesn’t really matter as long as the sample is randomly drawn. This is one reason why most professional surveys like Gallup or Pew will have somewhere in the range of 1500-2500 interviews on up/down type questions. However, if 2500 represents a substantial fraction of the true population - like 2500 samples drawn from a total population of 3000 people, you will (not surprisingly) get a tighter margin of error. There’s another formula in that case, which I forget off the top of my head.
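I think it involves the finite population correction, which shrinks the standard error as the sample approaches the whole population. Something like the following sketch (my recollection, so double-check it):

```python
import math

def se_proportion(p: float, n: int, pop: int | None = None) -> float:
    """Standard error of a proportion, with an optional finite
    population correction of sqrt((pop - n) / (pop - 1))."""
    se = math.sqrt(p * (1 - p) / n)
    if pop is not None:
        se *= math.sqrt((pop - n) / (pop - 1))
    return se

print(se_proportion(0.5, 2500))             # ~0.0100 (huge population)
print(se_proportion(0.5, 2500, pop=3000))   # ~0.0041 -> much tighter margin
```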
There was an interesting discussion I read somewhere about the Kinsey surveys in the 1950s. Kinsey thought that as long as the N was really, really large, the results would be reliable, so he amassed tens of thousands, perhaps hundreds of thousands, of interviews from anyone willing to give one. But there was a problem of response bias… I think the kinkier people wanted to tell Kinsey all about the stuff they did, and perhaps embellished for fun, and so even though the sample size was large, it was skewed to be extra horny. I think Masters and Johnson found, using other techniques, that people’s real sex lives were a lot more boring than the Kinsey study suggested.