Someone pointed me to this post in the Monkey Cage, a political science blog that I participate in.
The post was about non-representativeness of political polls, and it had one good point and one bad point. Overall I think the claims in the post were overstated.
Before getting into the details I’ll copy out the key points of the post:
The 2020 election will have the highest share of nonwhite voters ever seen in a presidential election year. But we’re not seeing that in the polls. Current polling methods don’t accurately sample minority voters. That failure particularly skews our understanding of the Democratic electorate . . .
In 2016, the Democratic primary electorate was 45 percent nonwhite, according to the American National Election Studies. In 2020, it should be 49 percent nonwhite. . . .
Between 2014 and 2018, people of color grew their share of the electorate by 3 percent. This proportion increased even more among Democrats, suggesting that assessments of exit polling arguably underestimate that increase. If so, any national poll’s sample of Democratic primary voters today should be about half white and half people of color. . . .
That leaves pollsters with two accuracy problems: surveying enough nonwhites and ensuring that sample represents the full array of people of color. They don’t. For instance, none of the qualifying polls has offered the survey in any Asian languages — even though Asian Americans are expected to make up 7 percent of Democratic voters nationally. And with Latinos making up an estimated 17 percent of Democratic voters, all qualifying polls should be available in Spanish. . . .
Many 2020 polls are conducted only in English or offer Spanish callbacks rather than immediate options — which results in sampling too few Spanish-speaking voters. As a result, too many of the “mainstream” polls significantly oversample college-educated and higher-income minority voters simply because they’re easier to reach than those with lower socioeconomic status. . . .
Using polls from the RealClearPolitics polling aggregator, I compared the benchmark demographics of the Democratic electorate — 51 percent white, 25 percent black, 17 percent Latino, 7 percent Asian American and Pacific Islander (AAPI) — with the demographics in recently released polls. For the benchmarks, I examined the racial composition of the general electorate and the Democratic electorate for the past 12 years, looking at census, exit poll and ANES data. Unfortunately, few pollsters reveal their complete racial demographics. But among those that provide demographic data, every one included too many white voters and too few minorities.
For example, a Fox News poll included a sample of probable Democratic primary voters that was 66 percent white and 34 percent nonwhite. Polls by Economist/YouGov, Emerson and CNN also had samples in which whites were 62 percent or more of likely Democratic primary voters. In a December Monmouth University poll, a full 71 percent of Democrats interviewed were white. Worse, samples are often so small that results aren’t even presented for black, Latino or AAPI voters — as in a Fox News poll that reports blacks and Latinos as “N/A.”
If we re-weight polls using the Democratic electorate’s expected racial composition, we can estimate how much racial sampling bias skews candidate support.
There are three issues in the above post.
1. Lots of the details seem iffy to me. If you go to different sources (exit polls, ANES, etc.), you’ll find different estimates of demographic composition (see here for more discussion of this point). That’s fine: it’s hard to actually estimate who votes. But, given that, the above post is way too certain in its presentation of numbers, for example in claiming “51 percent white, 25 percent black, 17 percent Latino, 7 percent Asian American and Pacific Islander” as “the benchmark demographics of the Democratic electorate.”
And I had difficulty tracking down lots of these values. For example, recall this bit:
Between 2014 and 2018, people of color grew their share of the electorate by 3 percent. This proportion increased even more among Democrats, suggesting that assessments of exit polling arguably underestimate that increase. If so, any national poll’s sample of Democratic primary voters today should be about half white and half people of color. . .
I followed the link to the 2018 exit poll and found this:
So, what percent of Democrats are white, black, etc.? We can figure it out by estimating that 0.44*0.72 of respondents are white Democrats, 0.90*0.11 are black Democrats, etc. Thus the percentage of Democrats who are white is 0.44*0.72/(0.44*0.72 + 0.90*0.11 + 0.69*0.11 + 0.77*0.03 + 0.54*0.03) = 0.60, and similarly we compute 19% black, 14% Latino, 7% other. So I don’t know how they take those exit polls and say that Democratic primary voters “should be about half white and half people of color.” Maybe they’re right (primary voters are not the same population as general election voters); I just don’t see where their numbers are coming from. At the very least, these numbers are a lot more uncertain than claimed in the post.
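For anyone who wants to check the arithmetic, here is a minimal Python sketch of the calculation above. The pairs of numbers are the ones quoted in the text (Democratic vote share within a group, times that group's share of all respondents). The variable and function names are my own, and lumping the last two rows together under the label "other" is an assumption that matches how the text reports the result.

```python
# Each row: (label, share of the group voting Democratic, group's share of respondents).
# The numbers are those quoted in the text; the last two rows are combined as "other".
rows = [
    ("white", 0.44, 0.72),
    ("black", 0.90, 0.11),
    ("Latino", 0.69, 0.11),
    ("other", 0.77, 0.03),
    ("other", 0.54, 0.03),
]


def democratic_composition(rows):
    """Return each group's share among Democratic voters."""
    joint = {}
    for label, dem_share, group_share in rows:
        # Fraction of all respondents who are in this group AND voted Democratic.
        joint[label] = joint.get(label, 0.0) + dem_share * group_share
    total = sum(joint.values())  # fraction of all respondents who voted Democratic
    return {label: value / total for label, value in joint.items()}


composition = democratic_composition(rows)
for label, share in composition.items():
    print(f"{label}: {share:.0%}")
```

This rounds to 60% white, 19% black, 14% Latino, and 7% other, the same figures given above. It says nothing about primary voters specifically; it only reproduces the general-election exit-poll arithmetic.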
2. The post makes one big mistake by not distinguishing between weighted and unweighted numbers. For example, they write that 71% of Democrats in the Monmouth poll were white, and they link to this image:
But check this out, from this very same poll report:
It’s the weighted, not the unweighted, results that are relevant. It’s completely misleading to claim a polling bias based on unadjusted numbers.
3. But adjustments aren’t perfect. The above post claims, “too many of the ‘mainstream’ polls significantly oversample college-educated and higher-income minority voters simply because they’re easier to reach than those with lower socioeconomic status.” That could be, if the surveys don’t appropriately adjust for education and income. Not adjusting for demographics can cause problems: it’s our understanding that the infamous state polling errors in the 2016 general election arose (see figure 2 here) because many polls in close states did not adjust for education. So, yeah, if you think this is a concern, you should include these variables in the adjustment.
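To make the adjustment point concrete, here is a minimal sketch of simple post-stratification on a single variable (education). Everything in it is hypothetical: the ten respondents, the candidate, and the 40/60 target are made up for illustration, and real pollsters typically adjust on several variables at once using methods that may differ from this one.

```python
from collections import Counter


def poststratification_weights(sample_cells, target_shares):
    """Weight per respondent = target share of their cell / sample share of their cell."""
    counts = Counter(sample_cells)
    n = len(sample_cells)
    return [target_shares[cell] / (counts[cell] / n) for cell in sample_cells]


def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical sample: 7 college graduates and 3 non-graduates.
education = ["college"] * 7 + ["no_college"] * 3
# 1 = supports hypothetical candidate A, 0 = does not.
supports_a = [1, 1, 1, 1, 1, 0, 0, 0, 0, 1]
# Hypothetical target: an electorate that is 40% college, 60% non-college.
target = {"college": 0.4, "no_college": 0.6}

weights = poststratification_weights(education, target)
unweighted = sum(supports_a) / len(supports_a)
weighted = weighted_mean(supports_a, weights)

print(f"unweighted support: {unweighted:.3f}")
print(f"weighted support:   {weighted:.3f}")
```

In this toy sample the raw composition is 70% college graduates, but the weighted estimate behaves as if the sample were 40% college graduates, which is why the raw composition of a poll is the wrong thing to criticize. The sketch also shows where the real difficulty lies: the answer depends on the `target` shares, and those are exactly the numbers people disagree about.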
In short, polls are inevitably flawed, hence the need for adjustment. The challenge is knowing what to adjust to. Pollsters know about adjustment. So, if there’s a problem, it’s not simply that there are too many white people in the survey. There’s room for honest disagreement regarding the demographic composition of the electorate.
P.S. The Monkey Cage post now has a long correction. But the correction still seems to miss the point that the relevant comparison is to weighted, not unweighted, numbers.