8.2 Comparing Two Proportions


Because comparative studies are so common, we often want to compare the proportions of two groups (such as men and women) that have some characteristic. We call the two groups being compared Population 1 and Population 2 and the two population proportions of “successes” $p_1$ and $p_2$. The data consist of two independent SRSs. The sample sizes are $n_1$ for Population 1 and $n_2$ for Population 2. The proportion of successes in each sample estimates the corresponding population proportion. Here is the notation we will use in this section:

Population   Population proportion   Sample size   Count of successes   Sample proportion
1            $p_1$                   $n_1$         $X_1$                $\hat{p}_1 = X_1/n_1$
2            $p_2$                   $n_2$         $X_2$                $\hat{p}_2 = X_2/n_2$

To compare the two unknown population proportions, start with the observed difference between the two sample proportions,

$$D = \hat{p}_1 - \hat{p}_2$$

When both sample sizes are sufficiently large, the sampling distribution of the difference $D$ is approximately Normal. What are the mean and the standard deviation of $D$? Each of the two $\hat{p}$'s has the mean and standard deviation given in the box on pages 418–419. Because the two samples are independent, the two $\hat{p}$'s are also independent. We can apply the rules for means and variances of sums of random variables. Here is the result, which is summarized in Figure 8.5.

Sampling Distribution of $\hat{p}_1 - \hat{p}_2$

Choose independent SRSs of sizes $n_1$ and $n_2$ from two populations with proportions $p_1$ and $p_2$ of successes. Let $D = \hat{p}_1 - \hat{p}_2$ be the difference between the two sample proportions of successes. Then

  • As both sample sizes increase, the sampling distribution of $D$ becomes approximately Normal.
  • The mean of the sampling distribution is $p_1 - p_2$.
  • The standard deviation of the sampling distribution is

$$\sigma_D = \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$$

Figure 8.5: The sampling distribution of the difference between two sample proportions is approximately Normal. The mean and standard deviation are found from the two population proportions of successes, $p_1$ and $p_2$.
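The mean and standard deviation of the difference can be checked by simulation. The following is a minimal sketch, not part of the original text: the population proportions and sample sizes are hypothetical values chosen only for illustration, and the function names are our own.

```python
import math
import random
import statistics


def theoretical_mean_sd(p1, n1, p2, n2):
    """Mean and standard deviation of D = p1_hat - p2_hat for independent SRSs."""
    mean = p1 - p2
    sd = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return mean, sd


def simulate_differences(p1, n1, p2, n2, reps, seed=0):
    """Draw `reps` pairs of independent samples and return the observed differences."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(reps):
        x1 = sum(rng.random() < p1 for _ in range(n1))  # successes in sample 1
        x2 = sum(rng.random() < p2 for _ in range(n2))  # successes in sample 2
        diffs.append(x1 / n1 - x2 / n2)
    return diffs


# Hypothetical population values (assumptions for illustration only).
p1, n1, p2, n2 = 0.6, 100, 0.4, 120
mean, sd = theoretical_mean_sd(p1, n1, p2, n2)
diffs = simulate_differences(p1, n1, p2, n2, reps=5000)
sim_mean = statistics.mean(diffs)
sim_sd = statistics.stdev(diffs)
print(f"theory:     mean = {mean:.4f}, sd = {sd:.4f}")
print(f"simulation: mean = {sim_mean:.4f}, sd = {sim_sd:.4f}")
```

The simulated mean and standard deviation should land close to the theoretical values, and a histogram of `diffs` would look roughly Normal for sample sizes this large.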


Apply Your Knowledge

Question 8.49

8.49 Rules for means and variances.

Suppose , , , and . Find the mean and the standard deviation of the sampling distribution of $\hat{p}_1 - \hat{p}_2$.

8.49

.

Question 8.50

8.50 Effect of the sample sizes.

Suppose , , , and .

  1. Find the mean and the standard deviation of the sampling distribution of $\hat{p}_1 - \hat{p}_2$.
  2. The sample sizes here are four times as large as those in the previous exercise, while the population proportions are the same. Compare the results for this exercise with those that you found in the previous exercise. What is the effect of multiplying the sample sizes by 4?

Question 8.51

8.51 Rules for means and variances.

It is quite easy to verify the mean and standard deviation of the difference $D = \hat{p}_1 - \hat{p}_2$.

  1. What are the means and standard deviations of the two sample proportions $\hat{p}_1$ and $\hat{p}_2$? (Look at the box on page 256 if you need to review this.)
  2. Use the addition rule for means of random variables: what is the mean of $D = \hat{p}_1 - \hat{p}_2$?
  3. The two samples are independent. Use the addition rule for variances of random variables to find the variance of $D = \hat{p}_1 - \hat{p}_2$.

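8.51

(a) $\mu_{\hat{p}_1} = p_1$, $\sigma_{\hat{p}_1} = \sqrt{p_1(1 - p_1)/n_1}$; $\mu_{\hat{p}_2} = p_2$, $\sigma_{\hat{p}_2} = \sqrt{p_2(1 - p_2)/n_2}$.

(b) $\mu_D = \mu_{\hat{p}_1} - \mu_{\hat{p}_2} = p_1 - p_2$.

(c) $\sigma_D^2 = \sigma_{\hat{p}_1}^2 + \sigma_{\hat{p}_2}^2 = \dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}$.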

Large-sample confidence intervals for a difference in proportions

The large-sample estimate of the difference in two proportions $p_1 - p_2$ is the corresponding difference in sample proportions $\hat{p}_1 - \hat{p}_2$. To obtain a confidence interval for the difference, we once again replace the unknown parameters in the standard deviation by estimates to obtain an estimated standard deviation, or standard error. Here is the confidence interval we want.

Confidence Interval for Comparing Two Proportions

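Choose an SRS of size $n_1$ from a large population having proportion $p_1$ of successes and an independent SRS of size $n_2$ from another population having proportion $p_2$ of successes.

The large-sample estimate of the difference in proportions is

$$D = \hat{p}_1 - \hat{p}_2 = \frac{X_1}{n_1} - \frac{X_2}{n_2}$$

The standard error of the difference is

$$SE_D = \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$$

and the margin of error for confidence level C is

$$m = z^* SE_D$$

where $z^*$ is the value for the standard Normal density curve with area C between $-z^*$ and $z^*$. The large-sample level C confidence interval for $p_1 - p_2$ is

$$D \pm m$$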

Use this method when the number of successes and the number of failures in each of the samples are at least 10.
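The critical value $z^*$ is usually read from a table, but it can also be computed. The sketch below is an illustration rather than part of the original text; it assumes only Python's standard library, and the function name `z_star` is our own.

```python
from statistics import NormalDist


def z_star(confidence):
    """Critical value z* with central area `confidence` under the standard Normal curve."""
    # Area C in the middle leaves (1 - C)/2 in each tail, so z* is the
    # (1 + C)/2 quantile of the standard Normal distribution.
    return NormalDist().inv_cdf((1 + confidence) / 2)


for c in (0.90, 0.95, 0.99):
    print(f"C = {c:.2f}: z* = {z_star(c):.3f}")
```

For 90%, 95%, and 99% confidence this gives approximately 1.645, 1.960, and 2.576.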


CASE 8.3 Social Media in the Supply Chain


In addition to traditional marketing strategies, marketing through social media has become an increasingly important component of the supply chain. This is particularly true for relatively small companies that do not have large marketing budgets. One study of Austrian food and beverage companies compared the use of audio/visual sharing through social media by large and small companies.13 Companies were classified as small or large based on whether their annual sales were less than or greater than 135 million euros. We use company size as the explanatory variable. It is categorical with two possible values. Media is the response variable, with the value Yes for companies that use audio/visual sharing on social media in their supply chain and No for those that do not.

Here is a summary of the data. We let $X$ denote the count of the number of companies that use audio/visual sharing.

Size                  $n$   $X$   $\hat{p} = X/n$
1 (small companies)   178   150   0.8427
2 (large companies)    52    27   0.5192

The study in Case 8.3 suggests that smaller companies are more likely to use audio/visual sharing through social media than are large companies. Let's explore this possibility using a confidence interval.

EXAMPLE 8.8 Small Companies versus Large Companies

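CASE 8.3 First, we find the estimate of the difference:

$$D = \hat{p}_1 - \hat{p}_2 = 0.8427 - 0.5192 = 0.3235$$

Next, we calculate the standard error:

$$SE_D = \sqrt{\frac{(0.8427)(0.1573)}{178} + \frac{(0.5192)(0.4808)}{52}} = 0.0745$$

For 95% confidence, we use $z^* = 1.96$, so the margin of error is

$$m = z^* SE_D = (1.96)(0.0745) = 0.1460$$

The large-sample 95% confidence interval is

$$D \pm m = 0.3235 \pm 0.1460 = (0.1775, 0.4695)$$

With 95% confidence, we can say that the difference in the proportions is between 0.18 and 0.47. Alternatively, we can report that the percent usage of audio/visual sharing through social media by smaller companies is about 32 percentage points higher than the percent for large companies, with a 95% margin of error of 15 percentage points.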
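The calculation in Example 8.8 can be reproduced in a few lines. This is a sketch using only Python's standard library, not output from JMP or Minitab; the function name `two_proportion_ci` is our own, and because it works with unrounded proportions and an unrounded $z^*$, the last digit may differ slightly from hand calculation.

```python
import math
from statistics import NormalDist


def two_proportion_ci(x1, n1, x2, n2, confidence=0.95):
    """Large-sample confidence interval for p1 - p2 from counts of successes."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    d = p1_hat - p2_hat                                  # estimate of the difference
    se = math.sqrt(p1_hat * (1 - p1_hat) / n1
                   + p2_hat * (1 - p2_hat) / n2)         # standard error of D
    z = NormalDist().inv_cdf((1 + confidence) / 2)       # critical value z*
    m = z * se                                           # margin of error
    return d, se, m, (d - m, d + m)


# Case 8.3 data: 150 of 178 small companies, 27 of 52 large companies.
d, se, m, (lower, upper) = two_proportion_ci(150, 178, 27, 52)
print(f"D = {d:.4f}, SE = {se:.4f}, m = {m:.4f}")
print(f"95% interval: ({lower:.2f}, {upper:.2f})")
```

The method applies here because each sample has at least 10 successes and at least 10 failures (150 and 28 for small companies, 27 and 25 for large companies).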

JMP and Minitab outputs for Example 8.8 appear in Figure 8.6. Note that JMP uses a different approximation from the one that we studied, which is the one used by Minitab. Other statistical packages provide similar output.

In surveys such as this, small companies and large companies typically are not sampled separately. The respondents to a single sample of companies are classified after the fact as small or large. The sample sizes are then random and reflect the characteristics of the population sampled. Two-sample significance tests and confidence intervals are still approximately correct in this situation, even though the two sample sizes were not fixed in advance.


Figure 8.6: JMP and Minitab outputs, Example 8.8: (a) JMP; (b) Minitab.

In Example 8.8, we chose small companies to be the first population. Had we chosen large companies as the first population, the estimate of the difference would be negative (−0.3235). Because it is easier to discuss positive numbers, we generally choose the first population to be the one with the higher proportion. The choice does not affect the substance of the analysis. It does make it easier to communicate the results.

Apply Your Knowledge

Question 8.52

8.52 Gender and commercial preference.

A study was designed to compare two energy drink commercials. Each participant was shown the commercials in random order and was asked to select the better one. Commercial A was selected by 44 out of 100 women and 79 out of 140 men. Give an estimate of the difference in gender proportions that favored Commercial A. Also construct a large-sample 95% confidence interval for this difference.


Question 8.53

8.53 Gender and commercial preference, revisited.

Refer to Exercise 8.52. Construct a 95% confidence interval for the difference in proportions that favor Commercial B. Explain how you could have obtained these results from the calculations you did in Exercise 8.52.

8.53

(−0.003, 0.252). We can just reverse the signs of the endpoints of the interval in the previous exercise and swap their order.

Plus four confidence intervals for a difference in proportions

Just as in the case of estimating a single proportion, a small modification of the sample proportions greatly improves the confidence intervals.14 The plus four confidence intervals will be approximately the same as the $z$ confidence intervals when the criteria for using those intervals are satisfied. When the criteria are not met, the plus four intervals will still be valid when both sample sizes are at least five and the confidence level is 90%, 95%, or 99%.

As before, we first add two successes and two failures to the actual data, dividing them equally between the two samples. That is, add one success and one failure to each sample. Note that we have added 2 to $n_1$ and to $n_2$. We then perform the calculations for the $z$ procedure with the modified data. As in the case of a single sample, we use the term Wilson estimates for the estimates produced in this way.

Wilson estimates

In Example 8.8, we had $X_1 = 150$, $n_1 = 178$, $X_2 = 27$, and $n_2 = 52$. For the plus four procedure, we would use $X_1 = 151$, $n_1 = 180$, $X_2 = 28$, and $n_2 = 54$.
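The plus four adjustment is easy to express in code. The following is a minimal sketch using only Python's standard library; the function name `plus_four_ci` is our own, and the counts are the Case 8.3 data.

```python
import math
from statistics import NormalDist


def plus_four_ci(x1, n1, x2, n2, confidence=0.95):
    """Plus four confidence interval for p1 - p2 from counts of successes."""
    # Add one success and one failure to each sample (four observations in all).
    p1_tilde = (x1 + 1) / (n1 + 2)
    p2_tilde = (x2 + 1) / (n2 + 2)
    d = p1_tilde - p2_tilde
    # Same z procedure as before, applied to the modified data.
    se = math.sqrt(p1_tilde * (1 - p1_tilde) / (n1 + 2)
                   + p2_tilde * (1 - p2_tilde) / (n2 + 2))
    z = NormalDist().inv_cdf((1 + confidence) / 2)
    return d - z * se, d + z * se


# Case 8.3 data; the function uses 151/180 and 28/54 internally.
lower, upper = plus_four_ci(150, 178, 27, 52)
print(f"plus four 95% interval: ({lower:.3f}, {upper:.3f})")
```

With sample sizes this large, the result should be close to the large-sample interval; the two methods differ more noticeably when the samples are small.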

Apply Your Knowledge

Question 8.54

8.54 Social media and the supply chain using plus four.

Refer to Example 8.8 (page 438), where we computed a 95% confidence interval for the difference in the proportions of small companies and large companies that use audio/visual sharing through social media as part of their supply chain. Redo the computations using the plus four method, and compare your results with those obtained in Example 8.8.

Question 8.55

8.55 Social media and the supply chain using plus four.

Refer to the previous exercise and to Example 8.8. Suppose that the sample sizes were smaller but that the proportions remained approximately the same. Specifically, assume that 17 out of 20 small companies used social media and 13 out of 25 large companies used social media. Compute the plus four interval for 95% confidence. Then, compute the corresponding $z$ interval and compare the results.

8.55

The plus four interval is (0.052, 0.548). The $z$ interval is (0.079, 0.581). The intervals are somewhat different when the sample sizes are small.

Question 8.56

8.56 Gender and commercial preference.

Refer to Exercises 8.52 and 8.53, where you analyzed data about gender and the preference for one of two commercials. The study also asked the same subjects to give a preference for two other commercials, C and D. Suppose that 92 women preferred Commercial C and that 120 men preferred Commercial C.

  1. The $z$ confidence interval for comparing two proportions should not be used for these data. Why?
  2. Compute the plus four confidence interval for the difference in proportions.

Significance tests

Although we prefer to compare two proportions by giving a confidence interval for the difference between the two population proportions, it is sometimes useful to test the null hypothesis that the two population proportions are the same.


We standardize $D = \hat{p}_1 - \hat{p}_2$ by subtracting its mean $p_1 - p_2$ and then dividing by its standard deviation

$$\sigma_D = \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$$

If $n_1$ and $n_2$ are large, the standardized difference is approximately $N(0, 1)$. To get a confidence interval, we used sample estimates in place of the unknown population proportions $p_1$ and $p_2$ in the expression for $\sigma_D$. Although this approach would lead to a valid significance test, we follow the more common practice of replacing the unknown $\sigma_D$ with an estimate that takes into account the null hypothesis $H_0\colon p_1 = p_2$. If these two proportions are equal, we can view all the data as coming from a single population. Let $p$ denote the common value of $p_1$ and $p_2$. The standard deviation of $D$ is then

$$\sigma_{Dp} = \sqrt{\frac{p(1 - p)}{n_1} + \frac{p(1 - p)}{n_2}} = \sqrt{p(1 - p)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$

The subscript on $\sigma_{Dp}$ reminds us that this is the standard deviation under the special condition that the two populations share a common proportion $p$ of successes.

We estimate the common value of $p$ by the overall proportion of successes in the two samples:

$$\hat{p} = \frac{\text{number of successes in both samples}}{\text{number of observations in both samples}} = \frac{X_1 + X_2}{n_1 + n_2}$$

This estimate of $p$ is called the pooled estimate because it combines, or pools, the information from two independent samples.

pooled estimate of $p$

To estimate the standard deviation of $D$, substitute $\hat{p}$ for $p$ in the expression for $\sigma_{Dp}$. The result is a standard error for $D$ under the condition that the null hypothesis $H_0\colon p_1 = p_2$ is true. The test statistic uses this standard error to standardize the difference between the two sample proportions.

Significance Tests for Comparing Two Proportions

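Choose an SRS of size $n_1$ from a large population having proportion $p_1$ of successes and an independent SRS of size $n_2$ from another population having proportion $p_2$ of successes. To test the hypothesis

$$H_0\colon p_1 = p_2$$

compute the $z$ statistic

$$z = \frac{\hat{p}_1 - \hat{p}_2}{SE_{Dp}}$$

where the pooled standard error is

$$SE_{Dp} = \sqrt{\hat{p}(1 - \hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$

based on the pooled estimate of the common proportion of successes

$$\hat{p} = \frac{X_1 + X_2}{n_1 + n_2}$$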


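In terms of a standard Normal random variable $Z$, the $P$-value for a test of $H_0$ against

$H_a\colon p_1 > p_2$ is $P(Z \ge z)$

$H_a\colon p_1 < p_2$ is $P(Z \le z)$

$H_a\colon p_1 \ne p_2$ is $2P(Z \ge |z|)$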

Use this test when the number of successes and the number of failures in each of the samples are at least five.

EXAMPLE 8.9 Social Media in the Supply Chain

CASE 8.3 Example 8.8 (page 438) analyzes data on the use of audio/visual sharing through social media by small and large companies. Are the proportions of social media users the same for the two types of companies? Here is the data summary:

Size                  $n$   $X$   $\hat{p} = X/n$
1 (small companies)   178   150   0.8427
2 (large companies)    52    27   0.5192

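The sample proportions are certainly quite different, but we need a significance test to verify that the difference is too large to easily result from the role of chance in choosing the sample. Formally, we compare the proportions of social media users in the two populations (small companies and large companies) by testing the hypotheses

$$H_0\colon p_1 = p_2$$
$$H_a\colon p_1 \ne p_2$$

The pooled estimate of the common value of $p$ is

$$\hat{p} = \frac{150 + 27}{178 + 52} = \frac{177}{230} = 0.7696$$

This is just the proportion of companies in the entire sample that use audio/visual sharing.

First, we compute the standard error

$$SE_{Dp} = \sqrt{(0.7696)(0.2304)\left(\frac{1}{178} + \frac{1}{52}\right)} = 0.0664$$

and then we use this in the calculation of the test statistic

$$z = \frac{\hat{p}_1 - \hat{p}_2}{SE_{Dp}} = \frac{0.8427 - 0.5192}{0.0664} = 4.87$$

The difference in the sample proportions is almost five standard deviations away from zero. The $P$-value is $2P(Z \ge 4.87)$. In Table A, the largest entry we have is $z = 3.49$ with $P(Z \le 3.49) = 0.9998$. So, $P(Z > 3.49) = 1 - 0.9998 = 0.0002$. Therefore, we can conclude that $P < 2 \times 0.0002 = 0.0004$. Our report: 84% of small companies use audio/visual sharing through social media versus 52% of large companies; the difference is statistically significant ($z = 4.87$, $P < 0.0004$).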
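The test in Example 8.9 can be sketched in code as follows. This is an illustration using only Python's standard library, not the JMP or Minitab procedure; the function name and the labels for the alternative (`"two-sided"`, `"greater"`, `"less"`) are our own choices.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2, alternative="two-sided"):
    """Pooled z test of H0: p1 = p2. Returns the z statistic and the P-value."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_pooled = (x1 + x2) / (n1 + n2)                    # pooled estimate of p
    se_pooled = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se_pooled
    std_normal = NormalDist()
    if alternative == "greater":        # Ha: p1 > p2, P(Z >= z)
        p_value = std_normal.cdf(-z)
    elif alternative == "less":         # Ha: p1 < p2, P(Z <= z)
        p_value = std_normal.cdf(z)
    elif alternative == "two-sided":    # Ha: p1 != p2, 2 P(Z >= |z|)
        p_value = 2 * std_normal.cdf(-abs(z))
    else:
        raise ValueError("alternative must be 'two-sided', 'greater', or 'less'")
    return z, p_value


# Case 8.3 data: 150 of 178 small companies, 27 of 52 large companies.
z, p_two_sided = two_proportion_z_test(150, 178, 27, 52)
_, p_greater = two_proportion_z_test(150, 178, 27, 52, alternative="greater")
print(f"z = {z:.2f}, two-sided P = {p_two_sided:.2e}, one-sided P = {p_greater:.2e}")
```

Software can evaluate the Normal tail area directly, so the computed $P$-value is far smaller than the bound of 0.0004 obtained from Table A. The one-sided $P$-value is half the two-sided value.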


Figure 8.7 gives the JMP and Minitab outputs for Example 8.9. Carefully examine the output to find all the important pieces that you would need to report the results of the analysis and to draw a conclusion. Note that the slight differences in results are due to the use of different approximations.

Some experts would expect the usage of social media to be greater for small companies than for large companies because small companies do not have the resources for large, expensive marketing efforts. These experts might choose the one-sided alternative $H_a\colon p_1 > p_2$. The $P$-value would be half of the value obtained for the two-sided test. Because the $z$ statistic is so large, this distinction is of no practical importance.

Figure 8.7: JMP and Minitab outputs, Example 8.9: (a) JMP; (b) Minitab.


Apply Your Knowledge

Question 8.57

8.57 Gender and commercial preference.

Refer to Exercise 8.52 (page 439), which compared women and men with regard to their preference for one of two commercials.

  1. State appropriate null and alternative hypotheses for this setting. Give a justification for your choice.
  2. Use the data given in Exercise 8.52 (page 439) to perform a two-sided significance test. Give the test statistic and the $P$-value.
  3. Summarize the results of your significance test.

8.57

(a) $H_0\colon p_1 = p_2$, $H_a\colon p_1 \ne p_2$. Not knowing anything about the two commercials, there is no reason to believe men or women will prefer Commercial A more, so the test should be two-sided. (b) Taking women as Population 1, $z = -1.90$ and the $P$-value is 0.0574. (c) The data do not show evidence of a difference between women and men concerning preference of Commercial A.

Question 8.58

8.58 What about preference for Commercial B?

Refer to Exercise 8.53 (page 440), where we changed the roles of the two commercials in our analysis. Answer the questions given in the previous exercise for the data altered in this way. Describe the results of the change.

Choosing a sample size for two sample proportions

In Section 8.1, we studied methods for determining the sample size using two settings. First, we used the margin of error for a confidence interval for a single proportion as the criterion for choosing $n$ (page 427). Second, we used the power of the significance test for a single proportion as the determining factor (page 429). We follow the same approach here for comparing two proportions.

Use the margin of error

Recall that the large-sample estimate of the difference in proportions is

$$D = \hat{p}_1 - \hat{p}_2 = \frac{X_1}{n_1} - \frac{X_2}{n_2}$$

the standard error of the difference is

$$SE_D = \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$$

and the margin of error for confidence level C is

$$m = z^* SE_D$$

where $z^*$ is the value for the standard Normal density curve with area C between $-z^*$ and $z^*$.

For a single proportion, we picked guesses for the true proportion and computed the margins of error for various choices of $n$. We can display the results in a table, as in Example 8.6 (page 428), or in a graph, as in Exercise 8.45 (page 435).

Sample Size for Desired Margin of Error

The level C confidence interval for a difference in two proportions will have a margin of error approximately equal to a specified value $m$ when the sample size for each of the two proportions is

$$n = \left(\frac{z^*}{m}\right)^2 \left(p_1^*(1 - p_1^*) + p_2^*(1 - p_2^*)\right)$$

Here $z^*$ is the critical value for confidence C, and $p_1^*$ and $p_2^*$ are guessed values for $\hat{p}_1$ and $\hat{p}_2$, the proportions of successes in the future sample.


The margin of error will be less than or equal to $m$ if $p_1^*$ and $p_2^*$ are chosen to be 0.5. The common sample size required is then given by

$$n = \frac{1}{2}\left(\frac{z^*}{m}\right)^2$$

Note that to use the confidence interval that is based on the Normal approximation, we still require that the number of successes and the number of failures in each of the samples are at least 10.

EXAMPLE 8.10 Confidence Interval-Based Sample Sizes for Preferences of Women and Men

Consider the setting in Exercise 8.52 (page 439), where we compared the preferences of women and men for two commercials. Suppose we want to do a study in which we perform a similar comparison using a 95% confidence interval that will have a margin of error of 0.1 or less. What should we choose for our sample size? Using $m = 0.1$ and $z^* = 1.96$ in our formula, we have

$$n = \frac{1}{2}\left(\frac{z^*}{m}\right)^2 = \frac{1}{2}\left(\frac{1.96}{0.1}\right)^2 = 192.08$$

We would include 192 women and 192 men in our study.

Note that we have rounded the calculated value, 192.08, down because it is very close to 192. The normal procedure would be to round the calculated value up to the next larger integer.
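The sample size calculation in Example 8.10 can be sketched as a small function. This is an illustration only; the function name `common_sample_size` and its parameter names are our own. It returns the unrounded value so that the rounding decision stays visible.

```python
import math


def common_sample_size(m, z_star, p1_guess=0.5, p2_guess=0.5):
    """Unrounded common sample size n for a desired margin of error m."""
    # n = (z*/m)^2 * (p1*(1 - p1*) + p2*(1 - p2*)); guesses of 0.5 are the
    # conservative choice because they make the margin of error largest.
    return (z_star / m) ** 2 * (p1_guess * (1 - p1_guess)
                                + p2_guess * (1 - p2_guess))


n_raw = common_sample_size(m=0.1, z_star=1.96)
print(f"calculated n = {n_raw:.2f}")
print(f"rounded up   = {math.ceil(n_raw)}")
```

Rounding up gives 193 per group, which guarantees a margin of error of at most 0.1 under the conservative guess. The example's choice of 192 gives a margin very slightly above 0.1 in that worst case.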

Apply Your Knowledge

Question 8.59

8.59 What would the margin of error be?

Consider the setting in Example 8.10.

  1. Compute the margins of error for and for each of the following scenarios: , ; , ; and , .
  2. If you think that any of these scenarios is likely to fit your study, should you reconsider your choice of and ? Explain your answer.

8.59

(a) 0.28. 0.27. 0.26. (b) Yes, under all three conditions, the margin of error is much larger than the desired 0.1 as given in Example 8.10.

Use the power of the significance test

When we studied using power to compute the sample size needed for a significance test for a single proportion, we used software. We will do the same for the significance test for comparing two proportions.

Some software allows us to consider significance tests that are a little more general than the version we studied in this section. Specifically, we used the null hypothesis $H_0\colon p_1 = p_2$, which we can rewrite as $H_0\colon p_1 - p_2 = 0$. The generalization allows us to use values different from zero in the alternative way of writing $H_0$. Therefore, we write $H_0\colon p_1 - p_2 = \Delta_0$ for the null hypothesis, and we will need to specify $\Delta_0 = 0$ for the significance test that we studied.

Here is a summary of the inputs needed for software to perform the calculations:

EXAMPLE 8.11 Sample Sizes for Preferences of Women and Men

Refer to Example 8.10, where we used the margin of error to find the sample sizes for comparing the preferences of women and men for two commercials. Let's find the sample sizes required for a significance test that the two proportions who prefer Commercial A are equal () using a two-sided alternative with and , , and 80% (0.80) power. Outputs from JMP and Minitab are given in Figure 8.8. We need women and men for our study.

Figure 8.8: JMP and Minitab outputs, Example 8.11: (a) JMP; (b) Minitab.


Note that the Minitab output (Figure 8.8(b)) gives the power curve for different alternatives. All of these have , which Minitab calls the “Comparison p,” while varies from 0.3 to 0.9. We see that the power is essentially 100% (1) at these extremes. It is 0.05, the type I error, at , which corresponds to the null hypothesis.

Apply Your Knowledge

Question 8.60

8.60 Find the sample sizes.

Consider the setting in Example 8.11. Change to 0.85 and to 0.90. Find the required sample sizes.

BEYOND THE BASICS: Relative Risk

In Example 8.8 (page 438), we compared the proportions of small and large companies with respect to their use of audio/visual sharing through social media by giving a confidence interval for the difference of proportions. Alternatively, we might choose to make this comparison by giving the ratio of the two proportions. This ratio is often called the relative risk (RR). A relative risk of 1 means that the proportions $\hat{p}_1$ and $\hat{p}_2$ are equal. Confidence intervals for relative risk apply the principles that we have studied, but the details are somewhat complicated. Fortunately, we can leave the details to software and concentrate on interpreting and communicating the results.

relative risk

EXAMPLE 8.12 Relative Risk for Social Media in the Supply Chain

CASE 8.3 The following table summarizes the data on the proportions of social media use for small and large companies:

Size                  $n$   $X$   $\hat{p} = X/n$
1 (small companies)   178   150   0.8427
2 (large companies)    52    27   0.5192

The relative risk for this sample is

$$RR = \frac{\hat{p}_1}{\hat{p}_2} = \frac{0.8427}{0.5192} = 1.62$$

Confidence intervals for the relative risk in the entire population of companies are based on this sample relative risk. Figure 8.9 gives output from JMP. Our summary: small companies are about 1.62 times as likely to use audio/visual sharing through social media as part of their supply chain as large companies; the 95% confidence interval is (1.24, 2.12).
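The text leaves the details of the relative risk interval to software and does not say which method JMP uses. As a rough sketch, one common large-sample approach builds a Normal-based interval for the logarithm of the relative risk and then transforms back. The code below assumes that method; the function name `relative_risk_ci` is our own.

```python
import math
from statistics import NormalDist


def relative_risk_ci(x1, n1, x2, n2, confidence=0.95):
    """Sample relative risk and a log-based large-sample confidence interval."""
    rr = (x1 / n1) / (x2 / n2)
    # Approximate standard error of log(RR) (an assumed method, see lead-in).
    se_log = math.sqrt(1 / x1 - 1 / n1 + 1 / x2 - 1 / n2)
    z = NormalDist().inv_cdf((1 + confidence) / 2)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, (lower, upper)


# Case 8.3 data: 150 of 178 small companies, 27 of 52 large companies.
rr, (lower, upper) = relative_risk_ci(150, 178, 27, 52)
print(f"RR = {rr:.2f}, 95% interval: ({lower:.2f}, {upper:.2f})")
```

For these data the sketch gives approximately (1.24, 2.12), in line with the interval reported above. Because the interval is symmetric on the log scale, it is not symmetric about the estimate on the original scale.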

Figure 8.9: JMP output, Example 8.12.


In Example 8.12, the confidence interval is clearly not symmetric about the estimate; that is, 1.62 is much closer to 1.24 than it is to 2.12. This is true, in general, for confidence intervals for relative risk.

Relative risk, comparing proportions by a ratio rather than by a difference, is particularly useful when the proportions are small. This way of describing results is often used for epidemiology and medical studies.