If these micropolling results are valid, Obama's in trouble in Pennsylvania:
These were conducted Oct. 23, 24, and 25.
Bucks County: O: 49 M: 43 2004 Results: K: 51 B: 48
Allegheny: O: 52 M: 42 2004 Results: K: 57 B: 42
Erie: O: 50 M: 43 2004 Results: K: 54 B: 45
York: M: 57 O: 39 2004 Results: B: 63 K: 35
Montgomery: O: 51 M: 39 2004 Results: K: 55 B: 44
John Kerry took Pennsylvania in 2004, but only by a narrow margin--51 to Bush's 49 percent. But these polls indicate that Obama isn't doing as well as Kerry did, except in York County (which seems to be going from red to blue). And between Murtha and the NRA, he's probably going to lose big in rural western Pennsylvania. Now maybe he can make it up in Philly, but Rendell might have to bring out the dead voters.
Sample size?
Pennsylvania has been polled like crazy. In the last month McCain has been hard pressed to climb above 40% while Obama has been steady around 50%.
Maybe your micro-poller knows something all the pollsters don't.
The pre-election polls in PA at this stage look much better for Obama than they did for Kerry. Not just by a point or two, but by a solid 8-10 points.
And yet, there is the fact that Obama is Black, which is a very hard pill to swallow in Appalachia. If Obama loses, I think Appalachia will be a dominant factor in that loss.
Also, note that McCain has not led in a single poll of PA (see RCP) since April.
So, one could argue that if McCain wins PA, this may mean a McCain sweep with over 300 electoral votes overall.
It's possible - PA is supposedly very hard to poll. There must be a reason McCain thinks PA is ripe for the picking. A Maverick State for the Team of Mavericks?
I've no idea what the sample size is, where the respondents came from, or if there is any validity to them. That's why I said "if." But with the polls all over the map this year, more than usual, it's hard to rely on any of them. The McCain campaign obviously thinks they have a good chance there, so maybe they know something that the pollsters don't. I continue to believe that the polls overestimate support for Obama by overweighting Democrats, even though a lot of "Democrats" only became Democrats as part of Operation Chaos last spring.
Holy crap. I completely forgot about Operation Chaos. There's a little monkey wrench to the polls.
Oops. I said monkey. Does that make me racist?
Of course, if McCain is spending money in PA that does not prove that he thinks he can win it. It could just mean that he has money to spend & nowhere else to go. Which is a depressing thought.
But these polls indicate that Obama isn't doing as well as Kerry did, except in York County (which seems to be going from red to blue).
Of course, McCain is also doing worse than Bush too (except Allegheny, where it's the same). It seems there are a lot more undecideds this time around.
I wonder if the Dow's 889-point rally is a sign that McCain's odds are improving.
I know if I had a big honkin stock portfolio I'd be thinking strongly about the Caymans if I thought Obama might win.
I've read that the party weights pollsters use on state polls derive from past election data; Rasmussen, for example, uses 2004 and 2006 data.
If that's the case, I would think Operation Chaos would have a negative effect on Obama's poll numbers at this point. For example, if the poll is looking for (or weighting to achieve) a sample that is 40% Dem and 35% Repub, that 40% will now include some Republicans who switched registration during Operation Chaos. Clearly these folks are not going to tell the pollster that they are voting for Obama, nor are they going to lie about their registration. Let's say this number is 4% (exaggerated, perhaps). Then these are 4% of voters in the sample whom the pollster thinks are Dems, but who actually aren't. If they were replaced by actual Dems (pre-Operation Chaos), the likelihood of more Obama voters in the sample increases. And in fact, since these registration switchers subtracted from the Repub pool, the Repub sample number is overweighted!
Seems to me, the polls should be adjusted upwards for Obama based on Operation Chaos.
Why do you think Operation Chaos should do the reverse and add to McCain's poll numbers? Someone explain my flawed logic please...
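The argument above can be sketched numerically. This is a rough illustration only: the 40% Dem weight, the 4% switcher share, and the 90% Obama support among genuine Democrats are all made-up numbers for the sake of the arithmetic, not anything from an actual poll.

```python
# Sketch of the Operation Chaos argument: if part of the "Dem" weight
# bucket is actually Republican switchers, the weighted topline
# undercounts Obama. All numbers here are illustrative assumptions.

target_dem_weight = 0.40   # pollster weights the sample to 40% Dem
chaos_fraction    = 0.04   # assumed share: Repubs registered as Dems
obama_among_real_dems = 0.90
obama_among_switchers = 0.00

# Dem bucket as the pollster sees it (switchers mixed in)
mixed = ((target_dem_weight - chaos_fraction) * obama_among_real_dems
         + chaos_fraction * obama_among_switchers)

# Dem bucket if those 4 points were genuine Democrats instead
pure = target_dem_weight * obama_among_real_dems

print(f"Obama votes from the D bucket: {mixed:.3f} vs {pure:.3f}")
# With these assumed numbers the gap is 3.6 points of the topline.
```

Under these (invented) inputs, the "mixed" bucket delivers fewer Obama votes than a bucket of genuine Democrats would, which is the undercount described above.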
It's not your logic that's flawed, Gerson, it's your understanding of poll weighting methodology. What they do is ask for whom people are voting and ask about party membership and leaning tendencies. Then they readjust the raw numbers so that the party membership percentages they just heard match the party membership numbers in the last election (or those in whatever model they're using, e.g. the Make Sure Obama Comes Out On Top fake-but-accurate model).
So let's say these are the raw party membership numbers:
party membership: 35% D, 30% R, 35% I
And let's just assume to make our math easy that all the D's say they'll vote for Obama, all the R's for McCain, and the I's split 70/30 for McCain. The raw election result numbers are therefore:
Obama 46%, McCain 54%
But now they adjust these raw numbers so the party membership breakdown matches their model, which says party membership next week will actually be 45% D, 25% R, and 30% I. That requires multiplying the D numbers by 45/35 = 1.29, multiplying the R numbers by 25/30 = 0.83, and the I numbers by 0.86. When you do all that, the poll results are:
Obama 54%, McCain 46%
Hey presto! You've turned a McCain sweep into an Obama blowout.
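The worked example above can be checked in a few lines. This is just a transcription of the hypothetical numbers already given (the 35/30/35 raw mix, the 45/25/30 model, and the assumed 70/30 independent split), not any real pollster's method; note that reweighting each party's respondents by target/raw multipliers is arithmetically the same as just recomputing the topline with the model's party shares.

```python
# Verify the party-ID reweighting example with its hypothetical numbers.

raw_share   = {"D": 0.35, "R": 0.30, "I": 0.35}  # party mix in the raw sample
model_share = {"D": 0.45, "R": 0.25, "I": 0.30}  # pollster's turnout model

# Assumed support within each party: Obama's share (McCain gets the rest).
obama_support = {"D": 1.00, "R": 0.00, "I": 0.30}

def topline(party_share):
    """Combine party shares with within-party support into topline percentages."""
    obama = sum(party_share[p] * obama_support[p] for p in party_share)
    return round(100 * obama, 1), round(100 * (1 - obama), 1)

print("raw:      Obama %.1f%%, McCain %.1f%%" % topline(raw_share))    # 45.5 / 54.5
print("weighted: Obama %.1f%%, McCain %.1f%%" % topline(model_share))  # 54.0 / 46.0
```

Same respondents, same answers; only the party weights changed, and the result flips from a McCain win to an Obama win.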
Now, you'd think that the way a poll is conducted is simple: you ring up 4000 people and ask them how they'll vote, then tot it all up and calculate percentages. You'd be wrong. They do all kinds of fiddling with the numbers, and this business of renormalizing with respect to party membership is the fishiest. It's worthwhile keeping in mind that polling is a business, and that means they're selling something. They're not selling the truth to the voters -- because the voters aren't paying them. So what are they selling? And to whom? Keep those questions in mind when you contemplate a poll, any poll.
As for the "Operation Chaos" business Rand mentions: I'm skeptical, myself, because I don't believe it was ever a big effect. And, even if it was, they are likely using R/D numbers from the last full election (2006), not the Democratic primary. To the extent Chaos can modify things, it can only be because they decide, based on the number of Democratic primary voters, and/or polls they took that measured party membership during the primary season, that the D numbers should be particularly high this fall. But they'd get that from the 2006 election anyway, not to mention their own wishes, so, again, I doubt the size of the effect.
There's also the "likely voter" business, where they decide what fraction of those raw numbers are actually going to turn up at the polls. In the past they have been fairly cynical and relied only partly on what people say they'll do, and partly on whether they say they voted before, and how often. But Gallup, for one, has a new model in which they just take people's word for it. In that model Obama has a much larger lead, presumably meaning he's got a lot of people who swear they're going to vote for him, despite having neglected to ever vote before. Hardly surprising.
Finally, sample size means absolutely dick in these things. The sample is pretty much always big enough to eliminate statistical fluctuations if your sampling is truly random and representative. And if it's not random or representative, then it wouldn't matter if it were 100 times as large.
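The point about sample size can be made concrete with the standard margin-of-error formula for a simple random sample, MOE ≈ 1.96·sqrt(p(1−p)/n) at 95% confidence. This is a back-of-envelope sketch assuming a near 50/50 race; it says nothing about selection bias, which is exactly the point.

```python
# Margin of error (95% confidence) for a simple random sample of a
# proportion p. Assumes truly random, representative sampling.
import math

def moe(n, p=0.5):
    return 1.96 * math.sqrt(p * (1 - p) / n)

for n in (1000, 100_000):
    print(f"n={n}: about ±{100 * moe(n):.1f} points")
# n=1000 already gives roughly ±3.1 points; a sample 100x larger only
# shrinks that by a factor of 10, and fixes no selection bias at all.
```

So past roughly a thousand respondents, the statistical noise is small compared to the modeling choices (party weights, likely-voter screens), which is where the real disagreements between polls come from.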
They do all kinds of fiddling with the numbers, and this business of renormalizing with respect to party membership is the fishiest. - Carl
If the party membership numbers are accurate, it's hard to see why this is a bad thing. Let's take a 1000 people sample. Well, the adjustment to fit a typical set is to look at the 1000 and adjust it so it is more typical. That is a check on whether the 1000 was sufficiently large to capture the random sample you actually want. The Dems in your sample could well have been undersampled. There is a choice between greatly increasing your sample size and adjusting a less expensive sample to fit desired randomness properties that are known a priori. Right?
In any case, the Operation Chaos argument I made should still hold - in that it is undercounting Obama's support.
Now I do think that there may be a different problem, namely that the people less prone to answer a pollster's call are predominantly Republican: pissed off about the country, don't like the media, etc. So yeah, I'm not convinced Obama can win unless he has margins bigger than 7-8 in a battleground state. Interestingly, his national fall-off the last few days does not seem to be reflected in the state polls.
I think we will learn a lot from the final results. ;-)
If the party membership numbers are accurate, it's hard to see why this is a bad thing.
That's because it comes out the way you want it to. You're reasoning backward from the conclusion you like to judge the accuracy of the method. Not very scientific.
Let's take a 1000 people sample. Well, the adjustment to fit a typical set is to look at the 1000 and adjust it so it is more typical.
Bzzt. Your unexamined assumption is that you know what is 'typical.' But you don't. You're guessing. And, furthermore, you're guessing in a way that, to the objective observer, seems a bit weird. Rather than regard the party membership percentages you get from the poll as 'typical,' you're using the party membership percentages in the last election, two years ago, as 'typical.' What kind of sense does that make? We've been hearing all year that party membership numbers have been shifting like crazy this year. So how are the numbers from 2006 more 'typical' than the numbers from the poll conducted right now?
That is a check on whether the 1000 was sufficiently large to capture the random sample you actually want.
Hardly. This is a red herring. There's no doubt at all, on purely statistical grounds, that a sample of 1000 is sufficiently large, provided that it is truly selected randomly and representatively, meaning you don't have any selection bias: e.g., you only call during the day M-F, which is going to pick up mostly suburban housewives; or you only call Saturday morning, which is going to disproportionately pick up young single folks who can sleep in; or you don't call cell phones; and so forth.
And what if you do select nonrepresentatively? Well, I would argue you're screwed, and an attempt to "correct" the problem by using the very crude device of fudging the numbers so that the party membership percentages match those in the last election is just plain old made-up bullshit.
I think a poll should just report the plain results of the poll, period. If you're worried about sampling representatively, then improve your sampling techniques. But this massaging of the data afterward, "correcting it," to my mind throws the reliability and integrity of the whole process into serious question.
I think we will learn a lot from the final results
No you won't. You'll pick the poll that's closest to the actual outcome and leap to the wholly unjustified conclusion that that poll's methodology must've been the most accurate. Which is utter nonsense. You might as well ask four people to predict next week's weather using tea leaves, chicken entrails, the I Ching, or a coin flip, and then take as the most credible method whichever gets closest to the actual result. Correlation, as we say in the stats business, is not causation. Just because pollster X got closest to the actual numbers means pretty much zip. Now, if he gets closest to the actual numbers again and again, over time, maybe he's onto something. But I've not yet heard anyone talk that way. They just talk about who was closest this time, or that, without realizing that only a consistent winning trend over time can possibly be interpreted as indicating a better methodology.