I posted at the end of December about our do-it-ourself NH political poll based on roadside election signs. So how did we do compared to the actual outcome? See for yourself:
| Candidate | % Signs | % Locations | % Votes |
|---|---|---|---|
| Mitt Romney | 33.6 | 26.2 | 39.3 |
| Rick Perry | 16.4 | 20.0 | 0.7 |
| Ron Paul | 16.4 | 18.5 | 22.9 |
| Newt Gingrich | 15.6 | 16.9 | 9.4 |
| Jon Huntsman | 12.3 | 15.4 | 16.9 |
| Rick Santorum | 5.7 | 3.1 | 9.4 |
| Michele Bachmann | 0 | 0 | 0.1 |
| Total | 100% | 100% | 98.7% |
We didn’t do great, especially in terms of Perry. But I had noted at the time that I was skeptical of the Perry numbers, since the sign placement had looked suspiciously like there had been just one Perry supporter who started off with a car full of signs and left them every mile or so, since they were almost all in a 10-mile stretch of the same road — as contrasted with the fairly random scattering of the other candidates’ signs.
So let’s drop Perry out of the analysis entirely. (Why not; the rate he’s going, he may as well drop out of the race, too.) Here are the graphed results, with best-fit lines drawn in for how well the % of total signs and the % of distinct sign locations matched up with the % of actual votes.

R² for the correlation between total signs and actual vote is 0.89, and R² for the correlation between distinct sign locations and actual vote is 0.77. I’m a little surprised; I would have expected the results to correlate more closely with number of locations, since it takes more effort for supporters to put 2 signs in 2 different places than 2 signs in the same place. However, if you’re trying to conduct a similar poll of your own (and especially if you have 2 teenagers to keep occupied on a long car drive while you do it), it’s probably worth collecting data both ways.
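For anyone who wants to check those R² numbers without Excel, here is a minimal Python sketch. It is not the spreadsheet I actually used; the function name `r_squared` is just something made up for illustration, and it assumes Perry is left out while Bachmann's zeros are kept in. With those assumptions it should come out at roughly the 0.89 and 0.77 quoted above.

```python
# Percentages from the table, with Perry dropped (assumption: Bachmann kept).
# Order: Romney, Paul, Gingrich, Huntsman, Santorum, Bachmann
signs     = [33.6, 16.4, 15.6, 12.3, 5.7, 0.0]
locations = [26.2, 18.5, 16.9, 15.4, 3.1, 0.0]
votes     = [39.3, 22.9,  9.4, 16.9, 9.4, 0.1]


def r_squared(xs, ys):
    """R² of a simple least-squares line of ys on xs (squared Pearson r)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross-products of the deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)


print(f"signs vs. votes:     R^2 = {r_squared(signs, votes):.2f}")
print(f"locations vs. votes: R^2 = {r_squared(locations, votes):.2f}")
```

Since R² doesn't change when one of the columns is rescaled, it makes no difference whether the sign percentages are renormalized to add up to 100 again after Perry is removed.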
I kind of feel like I should object to this on the basis that the R^2 test you’re using for the correlation isn’t really appropriate. (There are two big problems: first, all of the observations have to add up to 1; second, R^2 treats being 0.01 off on an observation of 0.01 the same as being 0.01 off on an observation of 0.5.) But I mostly think it’s awesome to see chemists plotting regression lines on data generated by teenagers. So I’ll shut up and be delighted.
dan, I knew you’d chime in if I used R^2 for this! Yeah, I know it’s not a terribly appropriate metric to use (although I hadn’t realized that everything adding up to 1 was a problem — that’s way outside my chemist-math realm.) It did seem like it might be meaningful enough to use for comparing the two data sets to each other. But maybe not.
More importantly, M was pretty impressed to see me chart it all in Excel. I think he needs to learn the graphing and formula-calculating tools in Excel, and school doesn’t seem to cover Excel at all, so this was a good opportunity to do some at home.