The key point is that these people have been picked to be as close as possible to a random sample. Ideally, the pollsters use dice or something like them to pick which people’s numbers they call. In practice, they can’t always get it perfectly random (what if you’re calling a Texas football town and the high school game is that night?), but they do try.
Finally, though, the pollsters have their sample, and it includes enough people according to their rules. But because those people were picked randomly, they almost certainly don’t exactly represent the real population.
Think about it: the pollsters have 100 million people, and they can only pick 1,000 of them. There are lots and lots of ways they could pick that random set. (Lots and lots: the interested reader should go to Wolfram Alpha and type in “100 million choose 1000” to see how many.) And almost none of those possible sets will really be a perfect sample.
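If you would rather not leave the page, here is a rough sketch of the same calculation in Python. It only counts the ways to choose 1,000 people out of 100 million; the variable names are mine, and the answer is far too long to print, so the sketch reports how many digits it has instead.

```python
import math

POPULATION = 100_000_000   # people who could be called
SAMPLE_SIZE = 1_000        # people who actually get picked

# "100 million choose 1000": the number of distinct sets of 1,000 people.
ways = math.comb(POPULATION, SAMPLE_SIZE)

# The number itself runs to thousands of digits, so count the digits
# rather than printing the whole thing.
digits = math.floor(math.log10(ways)) + 1
print(f"The number of possible samples has {digits:,} digits.")
```

For comparison, the population itself, 100 million, is a nine-digit number.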
What they have instead may give them 63 percent Democrats and 32 percent Republicans.
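To get a feel for how a random sample drifts away from the real population, here is a small simulation sketch. Everything in it is an assumption for illustration: the "true" mix of 37 percent Democrats, 32 percent Republicans, and 31 percent independents is made up, and each respondent is drawn independently, which is a close stand-in for picking 1,000 different people out of 100 million.

```python
import random
from collections import Counter

# Assumed "true" population mix, purely for illustration.
PARTIES = ["Democrat", "Republican", "Independent"]
TRUE_SHARES = [0.37, 0.32, 0.31]
SAMPLE_SIZE = 1000


def draw_sample(rng):
    """Draw one random sample and count how many of each party it holds."""
    # With 100 million people and only 1,000 picks, drawing each respondent
    # independently closely approximates sampling without replacement.
    picks = rng.choices(PARTIES, weights=TRUE_SHARES, k=SAMPLE_SIZE)
    return Counter(picks)


rng = random.Random(2024)  # fixed seed so the run is repeatable
samples = [draw_sample(rng) for _ in range(5)]

for counts in samples:
    print(", ".join(f"{p}: {counts[p] / SAMPLE_SIZE:.1%}" for p in PARTIES))
```

Each line of output is one imaginary poll. The samples tend to land near the assumed mix but rarely exactly on it, and they differ from one another.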
Now, here’s where the pollster’s special magic comes in. Through mathematical methods, they can reweight these numbers so that the sample stands in for a population with a different mix of people. So they create a mathematical model of the population, and they adjust the raw results to match that model. In an election poll like the ones we’re talking about, that is called a turnout model, and it comes down to saying that you expect there to be, say, 37 percent Democrats, 32 percent Republicans, and 31 percent independents.
So, the polling company takes their actual raw data, and they fit it to that turnout model, and that’s what they present as the result of this poll.
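Here is a minimal sketch of what that fitting step could look like, using simple one-variable weighting by party. It is not any particular pollster's method, and real polls usually weight on several things at once. The raw mix and the turnout model come from the example above; the 5 percent independents in the raw sample is my assumption (whatever is left after Democrats and Republicans), and the support numbers for a hypothetical Candidate A are invented.

```python
# Party mix in the raw sample. The 5% independents is an assumption:
# the remainder after 63% Democrats and 32% Republicans.
raw_share = {"Democrat": 0.63, "Republican": 0.32, "Independent": 0.05}

# The turnout model: who the pollster expects to actually vote.
turnout_model = {"Democrat": 0.37, "Republican": 0.32, "Independent": 0.31}

# Hypothetical share of each group backing Candidate A (made-up numbers).
support_for_a = {"Democrat": 0.90, "Republican": 0.10, "Independent": 0.50}

# Each group's weight: how much one respondent must count for the sample
# to match the turnout model.
weights = {g: turnout_model[g] / raw_share[g] for g in raw_share}

# Unweighted result: just what the raw sample says.
raw_result = sum(raw_share[g] * support_for_a[g] for g in raw_share)

# Weighted result: every group is scaled by its weight first.
weighted_result = sum(
    raw_share[g] * weights[g] * support_for_a[g] for g in raw_share
)

for group, weight in weights.items():
    print(f"{group}: weight {weight:.2f}")
print(f"Raw result for Candidate A:      {raw_result:.1%}")
print(f"Weighted result for Candidate A: {weighted_result:.1%}")
```

With these made-up numbers, Candidate A drops from 62.4 percent in the raw data to 52.0 percent after weighting, and each independent in the sample counts more than six times over. That is how much the choice of turnout model can matter.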
Those results, however, depend on two things: how well the sample matches the real population, and how well the turnout model fits what people actually do on that longed-for day when we actually have the election and can stop fretting about polling companies.
So with all that in mind, now let’s think about how to read a poll.