5.1 Samples and Their Populations

Almost everything worth evaluating requires a sample, from voting trends to sales patterns to the effectiveness of flu vaccines. The goal of sampling is simple: Collect a sample that represents the population. As Lillian Gilbreth reminds us, efficient living and efficient sampling are both satisfying and possible—and far easier in theory than they are in practice.

A random sample is one in which every member of the population has an equal chance of being selected into the study.

A convenience sample is one that uses participants who are readily available.

There are two main types of samples: random samples and convenience samples. A random sample is one in which every member of the population has an equal chance of being selected into the study. A convenience sample is one that uses participants who are readily available, such as college students. A random sample remains the ideal and is far more likely to lead to a representative sample, but it is usually expensive and can present a lot of practical problems. It is often almost impossible to get access to every member of the population in order to be able to choose a random sample from among them. Technologies such as Amazon Mechanical Turk, SurveyMonkey, and many other Internet tools offer new ways of obtaining convenience samples from a more diverse pool of participants.

MASTERING THE CONCEPT

5.1: There are two main types of samples in social science research. In the ideal type (a random sample), every member of the population has an equal chance of being selected to participate in a study. In the less ideal but more common type (a convenience sample), researchers use participants who are readily available.

Random Sampling

Imagine that there has recently been a traumatic mass murder in a town and that there are exactly 80 officers in the town’s police department. You have been hired to determine whether peer counseling or professional counseling is the more effective way to address the department’s concerns in the aftermath of this trauma. Unfortunately, budget constraints dictate that the sample you can recruit must be very small—just 10 people. How do you maximize the probability that 10 officers will accurately represent the larger population of 80 officers?

Here’s one way: Assign each of the 80 officers a number, from 01 to 80. Next, use a random numbers table (Table 5-1) to choose a sample of 10 police officers: (a) select any point on the table; (b) decide to go across, back, up, or down to read through the numbers.

TABLE 5-1. Excerpt from a Random Numbers Table
This is a small section from a random numbers table used to randomly select participants from a population to be in a sample as well as to randomly assign participants to experimental conditions.
04493 52494 75246 33824 45862 51025 61962
00549 97654 64501 88159 96119 63896 54692
35963 15307 26898 09354 33351 35462 77974
59808 08391 45427 26842 83609 49700 46058

For example, say you begin with the sixth digit of the second row of Table 5-1 and read across. The first 10 digits read: 97654 64501. (The spaces between sets of five digits are included solely to make it easier to read the table.) The first pair of digits is 97, but we would ignore this number because we only have 80 people in the population. The next pair is 65. The 65th police officer in the list would be chosen for the sample. The next two pairs, 46 and 45, would also be in the sample, followed by 01. If we come across a number a second time (45, for example), we ignore it, just as we would ignore 00 and anything above 80. Stick to your decision: you can’t rule out Officer Diaz because she seems to be adjusting well on her own or rule in Officer McIntyre because you think he needs the help right away.
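The table-reading procedure can also be written out as a short Python sketch. This is only an illustration: the function name `select_from_digits` and its parameters are our own, and it assumes the population is small enough (99 or fewer) that two-digit numbers cover it.

```python
def select_from_digits(digits, population_size, sample_size):
    """Read two-digit numbers from a string of random digits and keep the usable ones."""
    # Drop the spaces that separate the groups of five digits.
    stream = [ch for ch in digits if ch.isdigit()]
    chosen = []
    for i in range(0, len(stream) - 1, 2):
        number = int(stream[i] + stream[i + 1])
        if number < 1 or number > population_size:
            continue  # ignore 00 and anything above the population size
        if number in chosen:
            continue  # ignore a number that has already been selected
        chosen.append(number)
        if len(chosen) == sample_size:
            break
    return chosen


# Second row of Table 5-1, starting at the sixth digit.
print(select_from_digits("97654 64501", population_size=80, sample_size=4))
# [65, 46, 45, 1]
```

These 10 digits yield only four usable numbers, so to reach a full sample of 10 officers we would keep reading across the table and pass in more digits.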

Were you surprised that random sampling selected both the number 46 and the number 45? Truly random numbers often have strings of numbers that do not seem to be random. For example, notice the string of three 3’s in the third row of the table.

You can also search online for a “random numbers generator,” which is how we came up with the following numbers: 10, 23, 27, 34, 36, 67, 70, 74, 77, and 78. You might be surprised that 4 of the 10 numbers were in the 70s. Don’t be. Random numbers are truly random, even if they don’t look random.
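If you would rather not rely on a website, a few lines of Python can play the role of the random numbers generator. This is a minimal sketch using the standard `random` module. The function name `draw_random_sample` and the optional `seed` argument are our own additions, and the seed is there only so that a draw can be reproduced.

```python
import random


def draw_random_sample(population_size, sample_size, seed=None):
    """Pick sample_size distinct numbers from 1 through population_size."""
    rng = random.Random(seed)
    # rng.sample draws without replacement, so no officer is chosen twice.
    return sorted(rng.sample(range(1, population_size + 1), sample_size))


# Choose 10 of the 80 officers; the result changes on every run unless a seed is given.
print(draw_random_sample(80, 10))
```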

Random samples are almost never used in the social sciences because we almost never have access to the whole population. For example, if we were interested in studying the eating behavior of voles, we would never be able to list the whole population of voles from which to then select a random sample. If we were interested in studying the effect of video games on the attention span of teenagers in the United Kingdom, we would never be able to identify all U.K. teenagers from which to choose a random sample.

Convenience Sampling

Generalizability refers to researchers’ ability to apply findings from one sample or in one context to other samples or contexts; also called external validity.

It is far more convenient (faster, easier, and cheaper) to use voles that we bought from an animal supply company or to gather teenagers from the local school—but there is a significant downside. A convenience sample might not represent the larger population. Generalizability refers to researchers’ ability to apply findings from one sample or in one context to other samples or contexts. This principle is also called external validity, and it is extremely important—why bother doing the study if it doesn’t apply to anyone?

Random Dots? True randomness often does not seem random. British artist Damien Hirst farms out the actual painting of many of his works, such as his famous dot paintings, to assistants. He provides his assistants with instructions, including to arrange the color dots randomly. One assistant painted a series of yellow dots next to each other, which led to a fight with Hirst, who said, “I told him those aren’t random…. Now I realize he was right, and I was wrong.”

Replication refers to the duplication of scientific results, ideally in a different context or with a sample that has different characteristics.

Fortunately, we can increase external validity through replication, the duplication of scientific results, ideally in a different context or with a sample that has different characteristics. In other words, do the study again. And again. Then ask someone else to replicate it, too. That’s the slow but trustworthy process by which science creates knowledge that is both reliable and valid. That’s also why some of the real “scientific breakthroughs” you hear about are really just the tipping point based on many smaller research discoveries.

A volunteer sample, or self-selected sample, is a special kind of convenience sample in which participants actively choose to participate in a study.

Liars’ Alert! We must be even more cautious when we use a volunteer sample (also called a self-selected sample), a convenience sample in which participants actively choose to participate in a study. Participants volunteer, or self-select, when they respond to recruitment flyers or choose to complete an online survey, such as polls that recruit people to vote for a favorite reality show contestant or college basketball team. We should be very suspicious of volunteer samples: the people who choose to take part may be very different from a randomly selected sample, so the information they provide may not represent the larger population that we are really interested in.

The Problem with a Biased Sample

Let’s be blunt: if you don’t understand sampling, then you make it easy for others to take advantage of you. For example, the colorful cosmetics catalog Lush Times uses the following testimonial about the amazing skin-rejuvenating powers of the face moisturizer called Skin’s Shangri La: “I’m nearly 60, but no one believes it, which proves Skin’s Shangri La works!” Let’s examine the flaws in this sample of “evidence”—a brief testimonial—for the supposed effectiveness of Skin’s Shangri La.

Are Testimonials Trustworthy Evidence? Does one middle-aged woman’s positive experience with Skin’s Shangri La—“I’m nearly 60, but no one believes it”—provide evidence that this moisturizer makes for younger-looking skin? Testimonials use a volunteer sample of one person, usually a biased person; moreover, you can bet that the testimonial a company uses in its advertising is the most flattering one.

The population of interest is women close to age 60. The sample is the one woman who wrote to Lush. Assuming that it is a real letter, there are two major problems. First, one person is not a trustworthy sample size. Second, this is a volunteer sample. The customer who had this experience chose to write to Lush. Was she likely to write to Lush if she did not feel very strongly about this product? Moreover, would Lush be likely to publish her statement if it weren’t positive? The moisturizer still might be effective, but this testimonial doesn’t provide evidence worth listening to.

Lush touts its products as meant for people of all ages. But with colorful, cartoon-like drawings and catchy product names such as Candy Fluff and Sonic Death Monkey, it seems likely that teens and 20-somethings are the intended consumers. A 60-year-old woman shopping at Lush may have a more youthful mind-set and appearance to begin with. Self-selection is a major problem, but all is not lost!

We could randomly assign a certain number of people to use Skin’s Shangri La and an equal number of people to use another product (or no product), and then see which group has better skin a certain number of weeks later. Which do you find more persuasive: a dubious testimonial or a well-designed experiment? If our honest answer is a dubious testimonial, then statistical reasoning once again leads us to ask a better question (nicely answered by social psychologists, by the way) about why anecdotes are sometimes more persuasive than science.

Random Assignment

Random assignment is the distinctive signature of a scientific study. Why? Because it levels the playing field when every participant has an equal chance of being assigned to any level of the independent variable. Random assignment is different from random selection. Random selection is the ideal way to gather a sample from a population; random assignment is what we do with participants once they have been recruited into a study, regardless of how they got there. Practical problems related to getting access to an entire population mean that random selection is almost never used; however, random assignment is used whenever possible—and solves many of the problems associated with a convenience sample.
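To make the distinction concrete, here is a small Python sketch of random assignment applied to participants who have already been recruited, however they were recruited. The function name `randomly_assign`, the participant numbers, and the `seed` argument are hypothetical. Shuffling the list and then dealing people out in turn is one common way to do this, not the only one.

```python
import random


def randomly_assign(participants, conditions, seed=None):
    """Shuffle recruited participants, then deal them out to the conditions in turn."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)  # every ordering of the participants is equally likely
    groups = {condition: [] for condition in conditions}
    for index, person in enumerate(shuffled):
        groups[conditions[index % len(conditions)]].append(person)
    return groups


# Ten recruited participants (hypothetical ID numbers) and the two counseling conditions.
recruited = [3, 8, 15, 22, 27, 41, 46, 58, 65, 79]
print(randomly_assign(recruited, ["peer counseling", "professional counseling"]))
```

Nothing in this sketch depends on how the 10 participants were chosen, which is exactly the point: random assignment works with a convenience sample as well as with a random one.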

MASTERING THE CONCEPT

5.2: Replication and random assignment to groups help overcome problems of convenience sampling. Replication involves repeating a study, ideally with different participants or in a different context, to see whether the results are consistent. With random assignment, every participant has an equal chance of being assigned to any level of the independent variable.

Random assignment involves procedures similar to those used for random selection. If a study has two levels of the independent variable, as in the study of police officers, then you would need to assign participants to one of two groups. You could decide, arbitrarily, to number the groups 0 and 1 for the “peer counseling” and “professional counseling” groups, respectively. Then, (a) select any point on the table; (b) decide to go across, back, up, or down to read through the numbers, and remember to stick to your decision.

For example, if you began at the first number of the last row of Table 5-1 and read the numbers across, ignoring any number but 0 or 1, you would find 0010000. The first two participants would thus be in group 0, the third would be in group 1, and the next four would be in group 0. (Again, notice the seemingly nonrandom pattern and remember that it is random.)
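Reading a row while ignoring every digit except 0 and 1 is easy to check with a short Python sketch. The function name `assignment_digits` is our own, and the row is copied from the last line of Table 5-1.

```python
def assignment_digits(row, allowed="01"):
    """Keep only the digits that correspond to a group number, in the order read."""
    return "".join(ch for ch in row if ch in allowed)


last_row = "59808 08391 45427 26842 83609 49700 46058"
print(assignment_digits(last_row))  # 0010000
```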

An online random numbers generator lets us tell the computer to give us one set of 10 numbers that range from 0 to 1. We would instruct the program that the numbers should not remain unique because we want multiple 0’s and multiple 1’s. In addition, we would request that the numbers not be sorted because we want to assign participants in the order in which the numbers are generated. When we used an online random numbers generator, the 10 numbers were 1110100001. In an experiment, we usually want equal numbers in the groups. In this case, the numbers happen to be exactly half 1’s and half 0’s. If they were not, we could decide in advance to use only the first five 1’s or the first five 0’s. Just be sure to establish your rule for random assignment ahead of time and then stick to it!
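One way to read the "first five 1’s or first five 0’s" rule is this: assign participants in the order the numbers are generated, and once a group is full, send everyone who remains to the other group. The Python sketch below follows that reading. The function name `assign_with_cap` and the exact handling of a full group are our assumptions, not a prescribed procedure.

```python
def assign_with_cap(bits, group_size):
    """Assign participants to group 0 or 1 in order, capping each group at group_size."""
    counts = {0: 0, 1: 0}
    assignments = []
    for ch in bits:
        group = int(ch)
        if counts[group] >= group_size:
            group = 1 - group  # this group is already full, so use the other one
        if counts[group] >= group_size:
            break  # both groups are full; ignore any remaining numbers
        counts[group] += 1
        assignments.append(group)
    return assignments


# The 10 numbers from the online generator are already balanced, so nothing changes.
print(assign_with_cap("1110100001", group_size=5))
# [1, 1, 1, 0, 1, 0, 0, 0, 0, 1]
```

Whatever rule you choose, it should be written down before the numbers are generated, so that it cannot be adjusted after seeing who lands in which group.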

CHECK YOUR LEARNING

Reviewing the Concepts

  • Data from a sample are used to draw conclusions about the larger population.
  • In random sampling, every member of the population has an equal chance of being selected for the sample.
  • In the behavioral sciences, convenience samples are far more common than random samples.
  • In random assignment, every participant has an equal chance of being assigned to one of the experimental conditions.
  • If a study that uses random assignment is replicated in several contexts, we can start to generalize the findings.
  • Random numbers may not always appear to be all that random; there may appear to be patterns.

Clarifying the Concepts

  • 5-1 What are the risks of sampling?

Calculating the Statistics

  • 5-2 Use the excerpt from the random numbers table (Table 5-1) to select 6 people out of a population of 80. Start by assigning each person a number from 01 to 80. Then select 6 of these people by starting in the fourth row and going across. List the numbers of the 6 people who were selected.
  • 5-3 Use the excerpt from the random numbers table (Table 5-1) to randomly assign these six people to one of two experimental conditions, numbered 0 and 1. This time, start at the top of the first column (with a 0 on top) and go down. When you get to the bottom of that column, start at the top of the second column (with a 4 on top). Using the numbers (0 and 1), list the order in which these people would be assigned to conditions.

Applying the Concepts

  • 5-4 For each of the following scenarios, state whether, from a practical standpoint, random selection could have been used. Explain your answer, including in it a description of the population to which the researcher likely wants to generalize. Then state whether random assignment could have been used, and explain your answer.
    1. A health psychologist examined whether postoperative recovery time was less among patients who received counseling prior to surgery than among those who did not.
    2. The head of a school board asked a school psychologist to examine whether children perform better in history classes if they use an online textbook as opposed to a printed textbook.
    3. A clinical psychologist studied whether people with diagnosed personality disorders were more likely to miss therapy appointments than were people without diagnosed personality disorders.

Solutions to these Check Your Learning questions can be found in Appendix D.