Last week I attended the annual meeting of the American Association for Public Opinion Research (AAPOR) in Atlanta. I had hoped to write some reactions while I was at the conference, but I was much busier than I anticipated. There were fewer “election and polling” track sessions this year, but there turned out to be quite a few other sessions of professional interest to me (though only tangentially related to this newsletter), and I had almost no free time.1 So, this week I will write two – perhaps three – posts on things I learned last week.
One of the things I hoped to learn at AAPOR was what folks were thinking about the polling so far this year. There was not a lot of discussion about that, and, in fact, there were fewer “big name” pollsters at the conference than I expected. I think that is because we are in the middle of an election year. It’s not just that folks are busy, but that most pollsters are in competition with one another and probably do not want to discuss what’s wrong with their own polling or divulge anything that might conflict with the interests of their sponsors, whether media or campaigns. Nevertheless, there were some interesting conversations about issues affecting current polling.
What’s causing the odd results we are seeing in sub-populations in recent polling?
This was the question I really wanted to hear people discuss last week. And it is clear that it is on the minds of just about everyone. We have seen prominent pollsters conclude that Black and Hispanic voters are moving toward Republicans in what would be a historic shift. They have also reported young voters shifting away from Biden in large numbers. These sub-group behaviors are largely responsible for the tightness in the race. But is this really happening? Can we really draw reliable conclusions from this kind of sampling? Is something else going on that is not apparent from the responses given? There is no one right or easy answer to these questions, but there are some things that may be affecting the polling.
The first problem is in the sampling itself. Sub-groups are small samples inside larger samples. Consequently, the error bars on these findings are very wide. This means the sampling has the potential to be very wrong and unrepresentative of the overall sub-group population. Several people – most notably journalist Shefali Luthra in a conference-wide panel event – called for more oversampling of sub-groups so that we can better understand them.
Oversampling is when a sub-group is surveyed in larger numbers than its share of the overall population being sampled so that we can get a more representative sample of that sub-group itself. That is, a pollster might decide to sample 400 members of a sub-group that represents, say, 11% of the population in a survey of 1,000 voters. The overall sample will be weighted so that the sub-group represents 11% of the overall sample,2 but the oversample itself can be used to give the pollster a more representative picture of the sub-group. In this case, the margin of error in the oversample would be about +/- 5 points.3 If we got 110 members of the sub-group in the overall sample, as would happen with a normal sampling process, the margin of error would be +/- 9 points. But it is more likely that the survey would get something like 50 members of the sub-group and have to weight them up to the correct proportion of the overall sample. The margin of error for the sub-group would then be +/- 14 points, even though the error margin for the overall sample of 1,000 would be +/- 3 points.
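To make the arithmetic behind those figures concrete, here is a rough back-of-the-envelope sketch – my own illustration, not anything presented at the conference – of the standard 95% margin-of-error calculation for the sample sizes mentioned above, plus the down-weighting of a 400-person oversample to an 11% share of a 1,000-person sample. It assumes a simple random sample and the worst-case 50% proportion, and it ignores design effects from weighting, which would widen the real-world error margins.

```python
from math import sqrt

def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    """Approximate 95% margin of error, in percentage points, for a simple random sample of size n."""
    return 100 * z * sqrt(p * (1 - p) / n)

# Sample sizes discussed above: the full 1,000-person sample, a 400-person
# oversample, the ~110 sub-group members a normal sample would yield at 11%,
# and a more realistic 50 sub-group members.
for n in (1000, 400, 110, 50):
    print(f"n = {n:>4}: +/- {margin_of_error(n):.1f} points")
# n = 1000: +/- 3.1 points
# n =  400: +/- 4.9 points
# n =  110: +/- 9.3 points
# n =   50: +/- 13.9 points

# Down-weighting the oversample: each of the 400 oversampled respondents gets
# a weight of (0.11 * 1000) / 400 = 0.275, so the sub-group still counts for
# 11% of the weighted 1,000-person sample.
weight = (0.11 * 1000) / 400
print(f"oversample weight: {weight:.3f}")
```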
As you can see, there are big problems associated with drawing conclusions from sub-group responses in normal surveys – particularly when those sub-groups are small and hard to reach.
It is clear that it has become very expensive to reach all types of respondents, not just the typically hard-to-reach groups. While one Republican pollster advocated continuing to conduct all polling interviews by phone, even he admitted it was really expensive and that he had to add text-to-web to bring the costs down.4 But there is something else going on that may be affecting the results we are seeing in sub-groups – are the sub-samples missing important segments of the sub-group population?
SSRS and KFF have done some research on what surveys may be missing by not calling respondents who have pre-paid phones. I have to admit, this is something I had never considered before. You might not realize – though you will once you hear it – that to get a cell phone or landline contract with a phone company you need three things: an ID, a certain minimum credit score, and an address. You can probably already see the problem.
There are significant numbers of low-income and otherwise poor folks who simply cannot qualify for regular contract phone service. Unable to access that service, they must turn to pre-paid phones. And these phones are not in the universe of phones called in most, if not all, surveys – including political polling. One might assume that these folks are also low-propensity voters and that missing them introduces only a tiny error. But we do not know that. In fact, one reason the polls were off so much in 2020 was that about 11 million more people turned out to vote than most election modelers thought was likely. We don’t really know who they were either. Including pre-paid phone users might have given us a heads-up about the increased turnout, which is why Trump got so much more support than the polling suggested at the time, even though he still lost.
KFF contacted pre-paid phone users in three surveys it conducted in the past year. None of these surveys were political polls; they dealt with attitudes and behaviors related to public health, discrimination, and immigration. What they found is that the composition of the sub-groups was different when pre-paid phone users were included, and their answers were not always similar to those of sub-group members using contract phones. In particular, they got a much higher response rate from Latinos with limited English proficiency in one survey.
To my knowledge, no one is yet including pre-paid phone users in political polling. KFF admitted that the cost of doing so is high. It was not clear to me what needs to be done to reach these folks, but it is clear that they are being excluded from surveys, and that could affect what we are seeing in the current polling versus what is really going on within the different sub-groups. More research is needed on this potentially important piece of the surveying puzzle.
1. And a couple of my evenings were consumed by watching my beloved Boston Bruins fail to advance in another post-season series.
2. There may be other methods used to ensure the sub-group’s share of the overall sample is correct, such as including only some of the oversampled respondents in the overall sample while keeping the rest of the oversample separate.
3. Because ultimately the idea in polling is to understand how people will vote, I don’t think any margin of error over 5 points is helpful. Even five points is a large error margin for potentially competitive races.
4. Other pollsters I heard at the conference disagreed that phone-only interviewing is the best way to reach a representative sample of the population, even if cost were not an issue. However, most agreed that phone calls should be included in any poll, though one well-known national pollster appears completely opposed to phone calls.