Today, we’re going to perform a little thought experiment. Einstein was big on them, so they have an honorable history. I want you to imagine a portable cement mixer — my Gods, they have them on Amazon! The one at the link holds 3.5 cubic feet of mix, a nice handy size for, say, pouring a pad for a barbecue or mortar for a flagstone walk. But instead of filling it with Portland cement and gravel, we’re going to fill it with a cubic foot of red BBs and another cubic foot of blue BBs. (It turns out that there are about 585,000 BBs to the cubic foot, so that makes a nice number.)

We switch it on, and it makes a horrible noise as all those BBs tumble about. After a few minutes, we figure the BBs are thoroughly mixed, and we have a bit of a headache.

The thing is, we were kind of sloppy filling the buckets. Just about then, the boss walks by and wants to know the proportions of red and blue BBs.

“Well, roughly half and half.”

“Roughly isn’t good enough. I want a good estimate, and I want to know how much error there may be.”

*Now* he tells us.

This is basically the problem that pollsters always have. They have a pretty good idea of how many voters there are in a state or in the whole country; what they want to know is the number of Republican voters versus the number of Democrat voters. (Notice how I artistically chose red and blue BBs?)

Now, it happens we know statistics, even if we’re not great at measuring BBs, so we know we can estimate the proportions by just counting a relative few of the BBs, as long as we assume they really are well-mixed, and that we take a sample randomly. In fact, it turns out that as long as the number of things we’re looking at is big and our sample is small, we can determine the margin of error just from the size of the sample. I’m not going to explain why here — thank me later — but you can read the Wikipedia article on it if you’re so inclined. The fact is, though, that for a 95 percent confidence level — the same one chance in 20 of being wrong we’ve talked about in previous polling articles — the margin of error for a sample of size *n* will be just about *1/√n*. So if we choose only 10 BBs, the margin of error is just about plus or minus 30 percent. Not very satisfactory. Let’s look at some other values:

| Sample size | Margin of error (pct) |
| --- | --- |
| 10 | 30 |
| 100 | 10 |
| 500 | 4 |
| 1000 | 3 |
| 5000 | 1.4 |
| 10000 | 1 |

By the way, this rule works for any random sample, whether it’s BBs, babies born in Baghdad, or Biden supporters, as long as the size of the whole population is really large.
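For anyone who would rather check the numbers than take them on faith, here is a minimal sketch of the arithmetic. The 1/√n rule is the one from this article; the second column uses the standard textbook normal approximation, z·√(p(1−p)/n) with z = 1.96 and p = 0.5, which is not spelled out above and is included only for comparison. The function name `margin_of_error` is made up for this example.

```python
import math

def margin_of_error(n, z=1.96, p=0.5):
    """Textbook normal-approximation margin for a simple random sample of size n.

    Assumption: z=1.96 is the usual 95 percent value, p=0.5 is the worst case.
    """
    return z * math.sqrt(p * (1 - p) / n)

print("sample   1/sqrt(n)   textbook")
for n in (10, 100, 500, 1000, 5000, 10000):
    rule_of_thumb = 1 / math.sqrt(n)  # the article's shortcut
    print(f"{n:>6}   {100 * rule_of_thumb:8.1f}%  {100 * margin_of_error(n):8.1f}%")
```

The two columns differ by about 2 percent of their value, since 1.96 × 0.5 is 0.98. That is why the shortcut is good enough for a table with rounded entries.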

We figure we can get away with a three percent margin of error, so we dip up a cupful of BBs and, being careful not to pick one color over the other, count out 1,000 of them. We discover there are 530 red BBs versus 470 blue BBs. So we can report to the boss that there are 53 percent red, plus or minus 3 percent.

If the mix really is 53 percent red and we repeat this experiment 20 times (we should hire an intern for that), we should get somewhere between 500 and 560 red BBs about 19 of those 20 times.
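If no intern is available, a computer will do. The sketch below repeats the 1,000-BB scoop many times and reports how often the red count lands between 500 and 560. It assumes the true mix is exactly 53 percent red and that each BB is drawn independently, which is reasonable when the mixer holds far more BBs than the scoop. The function names and the seed are invented for this example.

```python
import random

def count_red(sample_size, true_red_share, rng):
    """Draw sample_size BBs one at a time and count the red ones."""
    return sum(rng.random() < true_red_share for _ in range(sample_size))

def share_of_trials_in_range(trials, low=500, high=560,
                             sample_size=1000, true_red_share=0.53, seed=1):
    """Fraction of repeated scoops whose red count falls in [low, high]."""
    rng = random.Random(seed)  # fixed seed so reruns give the same answer
    hits = 0
    for _ in range(trials):
        red = count_red(sample_size, true_red_share, rng)
        if low <= red <= high:
            hits += 1
    return hits / trials

print(f"{share_of_trials_in_range(2000):.1%} of scoops landed between 500 and 560 red")
```

The result should come out close to 95 percent, which is the "19 times out of 20" figure in simulated form.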

Of course, this is really just what a polling company does: they ask some number of people who they plan to vote for, and count the results. If they say the margin of error is 3 percent, that means they talked to roughly a thousand people. Every time. This also explains why the margin of error is usually between, say, 2 and 4 percent: a reasonable sample size is somewhere between 500 and 2000.

Now, the polling companies could get narrower margins of error, but if you look back at the table, you can see that to reduce the margin from 3 percent to 1 percent you have to ask *ten times* as many people their opinion. When you figure that every polling call costs between $2 and $20, you can see that can get expensive quickly.
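To put rough dollar figures on that, here is a small sketch that turns a target margin into a sample size using the 1/√n rule, then multiplies by the $2 to $20 range quoted above. It treats that cost as applying to each completed interview, which is an assumption, and `sample_size_for` is a name made up for this example. The rule gives 1,112 rather than an even 1,000 for 3 percent, because 1/√1000 is really about 3.2 percent.

```python
import math

COST_LOW, COST_HIGH = 2, 20  # dollars per call, the range quoted in the article

def sample_size_for(margin_pct):
    """Smallest n with 1/sqrt(n) <= margin, for a margin given in whole percent."""
    return math.ceil(10000 / (margin_pct * margin_pct))

for margin_pct in (3, 2, 1):
    n = sample_size_for(margin_pct)
    print(f"+/-{margin_pct}%: {n:>6,} interviews, "
          f"${COST_LOW * n:,} to ${COST_HIGH * n:,}")
```

Because the sample size grows with the square of the precision, halving the margin quadruples the bill.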

But now, let’s extend our thought experiment — sounds much more impressive in German, by the way, “*Gedankenexperiment*” — and assume that the paint on the red BBs wasn’t completely dry. It’s a little sticky, so when we’re mixing the BBs in the cement mixer they tend to stick to each other and to the walls of the mixer bucket. A lot of the red BBs are taken out of circulation. That means when we dig out a handful of BBs, they’re not really well-mixed, and our sample isn’t really random. We get more blue BBs because the red ones are stuck to the mixer.

Now, we count out our 1,000 BBs, but this time we get 490 red BBs and 510 blue BBs. This is what is called, in statistics, a *systematic error* or *systematic bias*. Here, the term bias doesn’t mean there is some conspiracy; it’s just that there is something about our method that systematically undercounts the red BBs.
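How sticky does the paint have to be to do that? The sketch below is one way to work it out. It assumes the true mix is 53 percent red (that is, the earlier count was exactly right), that only red BBs stick, and that stuck BBs never end up in the scoop. The article gives no figure for how many BBs stuck, so the stuck fractions are purely illustrative, and `observed_red_share` is a hypothetical helper.

```python
def observed_red_share(true_red_share, stuck_fraction):
    """Share of red among the BBs still loose enough to scoop up."""
    loose_red = true_red_share * (1 - stuck_fraction)
    loose_blue = 1 - true_red_share  # assumption: no blue BBs stick
    return loose_red / (loose_red + loose_blue)

TRUE_RED_SHARE = 0.53  # assumption: the earlier 530-of-1,000 count was exactly right

for stuck in (0.0, 0.05, 0.10, 0.15):
    share = observed_red_share(TRUE_RED_SHARE, stuck)
    print(f"{stuck:4.0%} of red BBs stuck -> scoop looks {share:.1%} red")
```

Under these assumptions, roughly 15 percent of the red BBs sticking is enough to turn a 53 percent red mix into a 49 percent red scoop. Counting more BBs would not help, because the error lies in which BBs are available, not in how many we count.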

So that’s the end of our experiment, and I’ll leave it as an exercise to decide why we’re mixing colorful BBs in our cement mixer, and how we’re going to get the stuck-on red BBs out before we return the mixer to the rental place. Instead, let’s look at Zombie’s recent piece. He (she? It? Does it matter to a zombie?) pointed out some interesting facts. Polling companies are now reporting that their telephone results are working out like this:

38% could not be reached

53% were contacted but actively refused to answer

9% cooperated and answered the polling questions

In round figures, that means out of 10 phone numbers tried, 4 don’t answer at all, 5 answer and hang up, and only about 1 out of 10 actually cooperates.

Think about our BBs. We only needed 40 red BBs out of a thousand to “refuse to be counted” to reverse our little “poll” completely. And here, we’re seeing that by the time our pollsters have counted 1,000 BBs, about 9,000 others have refused to be counted.
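Using the reported 9 percent rather than the rounded one in ten, the arithmetic looks like this. It is a back-of-the-envelope sketch: the three rates are the ones quoted above, the results are averages rather than guarantees, and `numbers_to_dial` is a name invented for this example.

```python
import math

# Outcome rates quoted in the article
UNREACHED, REFUSED, COOPERATED = 0.38, 0.53, 0.09

def numbers_to_dial(completed_target, cooperation_rate=COOPERATED):
    """Average number of phone numbers needed to get the target number of answers."""
    return math.ceil(completed_target / cooperation_rate)

dials = numbers_to_dial(1000)
print(f"numbers dialed:    {dials:,}")
print(f"never reached:     {round(dials * UNREACHED):,}")
print(f"reached, refused:  {round(dials * REFUSED):,}")
print(f"never counted:     {dials - 1000:,}")
```

At the exact 9 percent rate, the uncounted pile is a bit over ten times the size of the sample, slightly worse than the round-figure estimate.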

What effect does this have on the polls? It’s hard to say: the one thing we don’t know is what the uncounted people think. But the one thing we can be pretty certain of is that the polls, with these kinds of conditions, are nearly useless.
