The challenge of Internet surveys

The Internet is having a profound effect on many things people do, and this includes surveys. Using the Internet to conduct “Web surveys” is becoming increasingly popular. Web surveys have several advantages over more traditional survey methods. It is possible to collect large amounts of survey data at lower costs than traditional methods allow. Anyone can put survey questions on dedicated sites offering free services; thus, large-scale data collection is available to almost every person with access to the Internet. Furthermore, Web surveys allow one to deliver multimedia survey content to respondents, opening up new realms of survey possibilities that would be extremely difficult to implement using traditional methods. Some argue that eventually Web surveys will replace traditional survey methods.

Although Web surveys are easy to do, they are not easy to do well. The reasons include many of the issues we have discussed in this chapter. Three major problems are voluntary response, undercoverage, and nonresponse. Voluntary response appears in several forms. Some Web surveys invite visitors to a particular website to participate in a poll. Misterpoll.com is one such example. Visitors to this site can participate in several ongoing polls, create their own poll, and respond multiple times to the same poll. Other Web surveys solicit participation through announcements in newsgroups, email invitations, and banner ads on high-traffic sites. An example is a series of 10 polls conducted by Georgia Tech’s Graphics, Visualization, and Usability Center (GVU) in the 1990s.

Although misterpoll.com indicates that the surveys on the site are primarily intended for entertainment, the GVU polls appear to claim some measure of legitimacy. The website www.cc.gatech.edu/gvu/user_surveys/ states that the information from these surveys “is valued as an independent, objective view of developing Web demographics, culture, user attitudes, and usage patterns.”

A third and more sophisticated example of voluntary response occurs when the polling organization creates what it believes to be a representative panel consisting of volunteers and uses panel members as a sampling frame. A random sample is selected from this panel, and those selected are invited to participate in the poll. A very sophisticated version of this approach is used by the Harris Poll Online.

Web surveys, such as the Harris Poll Online, in which a random sample is selected from a well-defined sampling frame are reasonable when the sampling frame clearly represents some larger population or when interest is only in the members of the sampling frame. One example is a Web survey that uses systematic sampling to select every nth visitor to a site, with the target population narrowly defined as visitors to the site. Another example is the Web surveys conducted on some college campuses. All students may be assigned email addresses and have Internet access. A list of these email addresses serves as the sampling frame, and a random sample is selected from this list. If the population of interest is all students at this particular college, these surveys can potentially yield very good results. Here is an example of this type of Web survey.

EXAMPLE 10 Doctors and placebos

A placebo is a dummy treatment like a salt pill that has no direct effect on a patient but may bring about a response because patients expect it to. Do academic physicians who maintain private practices sometimes give their patients placebos? A Web survey of doctors in internal medicine departments at Chicago-area medical schools was possible because almost all doctors had listed email addresses.

An email was sent to each doctor explaining the purpose of the study, promising anonymity, and giving an individual Web link for response. Result: 45% of respondents said they sometimes use placebos in their clinical practice.
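
The two frame-based designs described before Example 10 (selecting every nth visitor to a site, and drawing a random sample from a list of campus email addresses) can be sketched in a few lines of Python. This is only an illustration: the visitor IDs, the email addresses, the sample sizes, and the function names are all made up for the sketch and do not come from any survey discussed here.

```python
import random


def systematic_sample(visitors, n, rng):
    """Select every nth visitor, starting at a random position among the first n."""
    start = rng.randrange(n)
    return visitors[start::n]


def simple_random_sample(frame, size, rng):
    """Draw `size` distinct units from the frame; every subset is equally likely."""
    return rng.sample(frame, size)


rng = random.Random(2024)  # fixed seed so the sketch is repeatable

# Hypothetical sampling frames (not real data).
visitors = [f"visitor{i:03d}" for i in range(1, 101)]
emails = [f"student{i:04d}@college.example" for i in range(1, 2001)]

every_tenth = systematic_sample(visitors, 10, rng)  # every 10th site visitor
invited = simple_random_sample(emails, 50, rng)     # 50 students to invite

print(len(every_tenth), "visitors selected;", len(invited), "students invited")
```

Both selections are probability samples from a well-defined frame, which is why conclusions extend to the frame itself (visitors to the site, or students at the college) and no further.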

Several other Web survey methods have been employed to eliminate problems arising from voluntary response. One is to use the Web as one of many alternative ways to participate in the survey. The Bureau of Labor Statistics and the U.S. Census Bureau have used this method. Another method is to select random samples from panels, but instead of relying on volunteers to form the panels, members are recruited using random sampling (for example, random digit dialing). Telephone interviews can be used to collect background information, identify those with Internet access, and recruit eligible persons to the panel. If the target population is current users of the Internet, this method should also potentially yield reliable results. The Pew Research Center has employed this method.

Perhaps the most ambitious approach, and one that attempts to obtain a random sample from a more general population, is the following. Take a probability sample from the population of interest. Provide all those selected with the necessary equipment and tools to participate in subsequent Web surveys. This methodology is similar in spirit to that used for the Nielsen TV ratings. It was employed by one company, InterSurvey, several years ago, although InterSurvey is no longer in business.

Several challenges remain for those who employ Web surveys. Internet and email use is growing: according to the 2012 Statistical Abstract of the United States, as of 2010, 80% of American adults aged 18 and older had Internet access at home or work, and 71% had Internet access at home. Even so, undercoverage remains a problem if Web surveys are used to draw conclusions about all American adults aged 18 and older. Weighting responses to correct for possible biases does not solve the problem because studies indicate that Internet users differ from nonusers in many ways that traditional methods of weighting do not account for.
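
A small arithmetic sketch shows why weighting cannot fully repair undercoverage. Every number below is hypothetical and chosen only for illustration: two age groups, made-up rates of Internet access, and made-up rates of answering "yes" to some survey question among people with and without access. The weighting shown is simple poststratification by age, used here as a stand-in for traditional weighting; it is not the method of any particular polling organization.

```python
# Hypothetical population, for illustration only. For each age group:
# (population share, share with Internet access,
#  proportion answering yes among those online,
#  proportion answering yes among those offline)
population = {
    "18-49": (0.55, 0.90, 0.60, 0.40),
    "50+":   (0.45, 0.65, 0.50, 0.30),
}


def true_proportion(pop):
    """Proportion answering yes among ALL adults, online or not."""
    return sum(
        share * (online * p_on + (1 - online) * p_off)
        for share, online, p_on, p_off in pop.values()
    )


def unweighted_web_estimate(pop):
    """What a Web sample of Internet users estimates with no weighting."""
    online_total = sum(share * online for share, online, _, _ in pop.values())
    yes_online = sum(share * online * p_on for share, online, p_on, _ in pop.values())
    return yes_online / online_total


def weighted_web_estimate(pop):
    """Poststratify by age: weight each age group back to its population share."""
    return sum(share * p_on for share, _, p_on, _ in pop.values())


truth = true_proportion(population)
unweighted = unweighted_web_estimate(population)
weighted = weighted_web_estimate(population)

print(f"true: {truth:.4f}  unweighted: {unweighted:.4f}  weighted: {weighted:.4f}")
```

In this made-up population, weighting by age moves the estimate only slightly toward the truth. The remaining gap comes from differences between Internet users and nonusers within each age group, which age weights cannot see.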

In addition, even if 100% of Americans had Internet access, there is no list of Internet users that we can use as a sampling frame, nor is there anything comparable to random digit dialing that can be used to draw random samples from the collection of all Internet users.

Finally, Web surveys often have very high rates of nonresponse. Methods that are used in phone and mail surveys to improve response rates can help, but they make Web surveys more expensive and difficult, offsetting some of their advantages.

STATISTICAL CONTROVERSIES

The Harris Poll Online

The Harris Poll Online has created an online research panel of more than 6 million volunteers. According to the Harris Poll Online website, the “panel consists of a diverse cross-section of people residing in the United States, as well as in over 200 countries around the world,” and “this multimillion member panel consists of potential respondents who have been recruited through online, telephone, mail, and in-person approaches to increase population coverage and enhance representativeness.” One can join the panel at join.harrispollonline.com.

When the Harris Poll Online conducts a survey, this panel serves as the sampling frame. A probability sample is selected from it, and statistical methods are used to weight the responses. In particular, the Harris Poll Online uses propensity score weighting, a proprietary Harris Interactive technique, which is also applied (when applicable) to adjust for respondents’ likelihood of being online. They claim that “this procedure provides added assurance of accuracy and representativeness.”

Are you convinced that the Harris Poll Online provides accurate information about well-defined populations such as all American adults? Why or why not?

For more information about the Harris Poll Online, visit www.theharrispoll.com

There is more information in a special issue of Public Opinion Quarterly, available online at http://poq.oxfordjournals.org/content/72/5.toc