10.3 Inference for Two Independent Proportions

OBJECTIVES By the end of this section, I will be able to …

  1. Perform and interpret $Z$ tests for $p_1 - p_2$.
  2. Compute and interpret $Z$ intervals for $p_1 - p_2$.
  3. Use $Z$ intervals for $p_1 - p_2$ to perform two-tailed $Z$ tests.

1 Independent Sample $Z$ Tests for $p_1 - p_2$

So far in this chapter, we have learned how to perform inference about population means. In this section, we learn how to perform hypothesis tests and construct confidence intervals about the difference between two population proportions. Recall that the sample proportion of success is the ratio of the number of successes to the number of trials in a binomial experiment.

In this section, we consider two independent samples, each of which yields a sample proportion: $\hat{p}_1$ and $\hat{p}_2$. For example, a recent survey found the sample proportions of males (sample 1) and females (sample 2) who agree that “technological changes will lead toward a future where people's lives are mostly better”; we denote these by $\hat{p}_1$ and $\hat{p}_2$, respectively.

(See Example 15 for further details about these data.) Here, we are interested in performing inference for the difference in population proportions $p_1 - p_2$, such as the difference in the proportions of all males and females who think technological change will lead to a better future. We use the difference in sample proportions $\hat{p}_1 - \hat{p}_2$ as our point estimate of the difference in population proportions $p_1 - p_2$, which is unknown. And just as in earlier sections where we investigated the sampling distribution of $\bar{x}_1 - \bar{x}_2$ to perform inference on $\mu_1 - \mu_2$, here we use the sampling distribution of $\hat{p}_1 - \hat{p}_2$ to help us perform inference about $p_1 - p_2$.

Developing Your Statistical Sense

Independent Samples Only

The inferential methods of this section are reserved for independent samples only. An example of a problem that would not use the methods of this section is the following: In the latest poll, suppose 45% of the respondents supported the Democratic candidate and 45% supported the Republican one. Because each respondent had to choose between the Democratic candidate and the Republican candidate, their respective poll numbers are not independent.

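The distribution of all possible values of $\hat{p}_1 - \hat{p}_2$ is called the sampling distribution of $\hat{p}_1 - \hat{p}_2$, with mean $\mu_{\hat{p}_1 - \hat{p}_2} = p_1 - p_2$ and standard error

$$\sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}$$

where $q_1 = 1 - p_1$ and $q_2 = 1 - p_2$.

Let $x_1$ and $x_2$ denote the number of successes, and let $(n_1 - x_1)$ and $(n_2 - x_2)$ denote the number of failures in sample 1 and sample 2, respectively. The sampling distribution of $\hat{p}_1 - \hat{p}_2$ is approximately normal when the number of successes and the number of failures in each sample are each at least 5, that is, when $x_1 \ge 5$, $(n_1 - x_1) \ge 5$, $x_2 \ge 5$, and $(n_2 - x_2) \ge 5$.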

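Sampling Distribution of $\hat{p}_1 - \hat{p}_2$

When two random samples are drawn independently from two populations, then the quantity

$$Z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\dfrac{p_1 q_1}{n_1} + \dfrac{p_2 q_2}{n_2}}}$$

has an approximately standard normal distribution when the following conditions are satisfied:

$x_1 \ge 5$, $(n_1 - x_1) \ge 5$, $x_2 \ge 5$, and $(n_2 - x_2) \ge 5$, where $\hat{p}_1$ and $n_1$ represent the sample proportion and sample size of the sample taken from population 1 with population proportion $p_1$; $\hat{p}_2$ and $n_2$ represent the sample proportion and sample size of the sample taken from population 2 with population proportion $p_2$; and $q_1 = 1 - p_1$ and $q_2 = 1 - p_2$.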
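The normality conditions and the standard error above can be sketched in a few lines of Python. This is only an illustration: the function names `conditions_met` and `standard_error` and the input counts are hypothetical and do not come from the text.

```python
from math import sqrt

def conditions_met(x1, n1, x2, n2, minimum=5):
    """True when each sample has at least `minimum` successes and failures."""
    return all(count >= minimum for count in (x1, n1 - x1, x2, n2 - x2))

def standard_error(p1, n1, p2, n2):
    """Standard error of p-hat1 - p-hat2 for population proportions p1 and p2."""
    return sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

# Hypothetical inputs, chosen only for illustration.
print(conditions_met(40, 100, 3, 50))    # sample 2 has only 3 successes
print(standard_error(0.5, 100, 0.5, 100))
```

In practice the population proportions are unknown, so `standard_error` is useful mainly for seeing how the spread of $\hat{p}_1 - \hat{p}_2$ shrinks as the sample sizes grow.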

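The three possible forms for the $Z$ test for $p_1 - p_2$ are as follows:

$H_0: p_1 = p_2$ versus $H_a: p_1 > p_2$    Right-tailed test
$H_0: p_1 = p_2$ versus $H_a: p_1 < p_2$    Left-tailed test
$H_0: p_1 = p_2$ versus $H_a: p_1 \ne p_2$    Two-tailed test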

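The null hypothesis asserts that $H_0: p_1 = p_2$. We denote this common population proportion as $p$. The null hypothesis is assumed true, so $p_1 - p_2 = 0$ and the test statistic takes the following form:

$$Z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{p(1 - p)\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

The common population proportion $p$ is unknown, so we estimate it using the following pooled estimate of $p$, where $x_1$ and $x_2$ are the numbers of successes in samples 1 and 2:

$$\hat{p}_{\text{pooled}} = \frac{x_1 + x_2}{n_1 + n_2}$$

Note: As a check on your arithmetic, $\hat{p}_{\text{pooled}}$ must also lie between $\hat{p}_1$ and $\hat{p}_2$.

Substituting this into the formula for the test statistic gives

$$Z_{\text{data}} = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}_{\text{pooled}}\,(1 - \hat{p}_{\text{pooled}})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

$Z_{\text{data}}$ measures the distance between the sample proportions. Extreme values of $Z_{\text{data}}$ indicate evidence against the null hypothesis.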
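As a rough sketch of the pooled estimate and the test statistic, the following Python functions compute both from the success counts and sample sizes. The names `pooled_estimate` and `z_data` and the counts used below are hypothetical, chosen only to show the arithmetic.

```python
from math import sqrt

def pooled_estimate(x1, n1, x2, n2):
    """Pooled estimate of the common proportion p under H0: p1 = p2."""
    return (x1 + x2) / (n1 + n2)

def z_data(x1, n1, x2, n2):
    """Test statistic for the difference in two sample proportions."""
    p_hat1, p_hat2 = x1 / n1, x2 / n2
    p_pooled = pooled_estimate(x1, n1, x2, n2)
    se = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    return (p_hat1 - p_hat2) / se

# Hypothetical counts: 60 of 100 successes in sample 1, 45 of 100 in sample 2.
p_pooled = pooled_estimate(60, 100, 45, 100)
print(p_pooled)                    # lies between 0.45 and 0.60, as the note requires
print(round(z_data(60, 100, 45, 100), 3))
```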

Hypothesis Test for the Difference in Two Population Proportions: Critical-Value Method

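Suppose we have two independent random samples taken from two populations with population proportions $p_1$ and $p_2$, and the required conditions are met: $x_1 \ge 5$, $(n_1 - x_1) \ge 5$, $x_2 \ge 5$, and $(n_2 - x_2) \ge 5$.

  • Step 1 State the hypotheses.

    Use one of the forms from Table 12 (page 609). State the meaning of $p_1$ and $p_2$.

  • Step 2 Find $Z_{\text{crit}}$ and state the rejection rule.

    Use Table 12 on page 609.

  • Step 3 Calculate $Z_{\text{data}}$.

    $$Z_{\text{data}} = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}_{\text{pooled}}\,(1 - \hat{p}_{\text{pooled}})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

    where $\hat{p}_{\text{pooled}} = \dfrac{x_1 + x_2}{n_1 + n_2}$.

    $Z_{\text{data}}$ follows an approximately standard normal distribution if the required conditions are satisfied.

  • Step 4 State the conclusion and the interpretation.

    Compare $Z_{\text{data}}$ with $Z_{\text{crit}}$.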
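Steps 2 and 4 of the critical-value method might be sketched as follows. Because Table 12 is not reproduced here, the rejection rules in the code are the usual ones for a standard normal test statistic and should be read as an assumption; the function name `critical_value_test` and the inputs are hypothetical.

```python
from statistics import NormalDist

def critical_value_test(z_data, alpha, tail):
    """Return (z_crit, reject) for tail in {'right', 'left', 'two'}.

    Assumed rules: right-tailed rejects when z_data >= z_crit, left-tailed
    when z_data <= -z_crit, two-tailed when |z_data| >= z_crit.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "two":
        z_crit = std_normal.inv_cdf(1 - alpha / 2)
        reject = abs(z_data) >= z_crit
    elif tail == "right":
        z_crit = std_normal.inv_cdf(1 - alpha)
        reject = z_data >= z_crit
    elif tail == "left":
        z_crit = std_normal.inv_cdf(1 - alpha)
        reject = z_data <= -z_crit
    else:
        raise ValueError("tail must be 'right', 'left', or 'two'")
    return z_crit, reject

# Hypothetical test statistic and level of significance.
z_crit, reject = critical_value_test(2.0, 0.05, "right")
print(round(z_crit, 3), reject)
```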

Table 12 Critical regions and rejection rules for $Z$ test for $p_1 - p_2$
[image]

EXAMPLE 15 $Z$ test for $p_1 - p_2$ using the critical-value method

In April 2014, the Pew Research Center published a report called U.S. Views of Technology and the Future,15 in which the results of a survey of Americans' views on the future of technology were examined. Among other questions, respondents were asked whether they agreed that “technological changes will lead toward a future where people's lives are mostly better.” The results are shown in Table 13. Assume the samples are independent.

Table 13 Proportions of males and females who agree that technological change will lead to a better future
Males Females
Number agreeing
Sample size
Sample proportion
Population proportion
  1. Find the point estimate $\hat{p}_1 - \hat{p}_2$ of the difference in the population proportions of males and females.
  2. Compute the pooled estimate of the common proportion, $\hat{p}_{\text{pooled}}$.
  3. Calculate the value of the test statistic $Z_{\text{data}}$.
  4. Check whether the conditions for performing the $Z$ test for $p_1 - p_2$ are met.
  5. Test whether the population proportion of males who agree that technology will lead to a better future is greater than the population proportion of females who agree. Use the critical-value method at level of significance $\alpha$.

Solution

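  1. The point estimate is $\hat{p}_1 - \hat{p}_2$, computed from the sample proportions in Table 13.
  2. [image]
    FIGURE 16 TI-83/84 results.
  3. We check the conditions for performing the $Z$ test for $p_1 - p_2$. We have: $x_1 \ge 5$, $(n_1 - x_1) \ge 5$, $x_2 \ge 5$, and $(n_2 - x_2) \ge 5$. We may thus proceed with the hypothesis test.
  4. The $Z$ test for $p_1 - p_2$ follows the steps below.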
  • Step 1 State the hypotheses.

    The key words “greater than,” together with the fact that sample 1 represents the males, indicate that we have a right-tailed test:

    $$H_0: p_1 = p_2 \quad \text{versus} \quad H_a: p_1 > p_2$$

    where $p_1$ and $p_2$ represent the population proportion of males and females, respectively, who agree that technology will lead to a better future.

    [image]
    FIGURE 17 $Z_{\text{data}}$ is extreme, leading to rejection of $H_0$.
  • Step 2 Find $Z_{\text{crit}}$ and state the rejection rule.

    For a right-tailed test with level of significance $\alpha$, Table 12 gives us $Z_{\text{crit}}$ and our rejection rule: Reject $H_0$ if $Z_{\text{data}} \ge Z_{\text{crit}}$.

  • Step 3 Calculate $Z_{\text{data}}$.

    From part 3 of the example, we have the value of $Z_{\text{data}}$ (also see Figure 16).

  • Step 4 State the conclusion and the interpretation.

    $Z_{\text{data}} \ge Z_{\text{crit}}$; therefore, reject $H_0$ (see Figure 17). There is evidence at level of significance $\alpha$ that the population proportion of males who agree that technology will lead to a better future is greater than the population proportion of females who agree.

NOW YOU CAN DO

Exercises 5–8.

We may also use the $p$-value method to perform the $Z$ test for $p_1 - p_2$.

Hypothesis Test for the Difference in Two Population Proportions: $p$-value Method

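Suppose we have two independent random samples taken from two populations with population proportions $p_1$ and $p_2$, and the required conditions are met: $x_1 \ge 5$, $(n_1 - x_1) \ge 5$, $x_2 \ge 5$, and $(n_2 - x_2) \ge 5$.

  • Step 1 State the hypotheses and the rejection rule.

    Use one of the forms from Table 12. State the meaning of $p_1$ and $p_2$. The rejection rule is: Reject $H_0$ if the $p$-value $\le \alpha$.

  • Step 2 Calculate $Z_{\text{data}}$.

    $$Z_{\text{data}} = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}_{\text{pooled}}\,(1 - \hat{p}_{\text{pooled}})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

    where $\hat{p}_{\text{pooled}} = \dfrac{x_1 + x_2}{n_1 + n_2}$. If the required conditions are satisfied, $Z_{\text{data}}$ follows an approximately standard normal distribution.

  • Step 3 Find the $p$-value.

    Either use technology or calculate the $p$-value using one of the forms in Table 14.

  • Step 4 State the conclusion and the interpretation.

    Compare the $p$-value with $\alpha$.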
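Step 3 can be carried out with technology. The sketch below shows one way to do so in Python; since Table 14 is not reproduced here, the three tail formulas are the usual ones for a standard normal test statistic and are an assumption, as is the function name `p_value`.

```python
from statistics import NormalDist

def p_value(z_data, tail):
    """p-value of a Z test for tail in {'right', 'left', 'two'}."""
    cdf = NormalDist().cdf  # standard normal cumulative distribution function
    if tail == "right":
        return 1 - cdf(z_data)             # P(Z > z_data)
    if tail == "left":
        return cdf(z_data)                 # P(Z < z_data)
    if tail == "two":
        return 2 * (1 - cdf(abs(z_data)))  # 2 * P(Z > |z_data|)
    raise ValueError("tail must be 'right', 'left', or 'two'")

# Hypothetical test statistic.
print(round(p_value(1.96, "two"), 4))
```

The returned value is then compared with $\alpha$ in Step 4.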

Table 14 $p$-Values for $Z$ test for $p_1 - p_2$
[image]

EXAMPLE 16 $Z$ test for $p_1 - p_2$ using the $p$-value method

The General Social Survey tracks trends in American society through annual surveys. Married respondents were asked to characterize their feelings about being married. The results are shown here in a crosstabulation with gender. Test the hypothesis that the proportion of females who report being very happily married is smaller than the proportion of males who report being very happily married. Use the $p$-value method with level of significance $\alpha$.

                        Marriage
          Very happy   Pretty happy/Not too happy   Total
Female       257                  166                423
Male         242                  124                366
Total        499                  290                789

Solution

From the crosstabulation, we assemble the statistics in Table 15 for the independent random samples of men and women.

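Table 15 Sample statistics of very happily married respondents
                      Sample size    Number very happy    Sample proportion very happy
Females (sample 1)    $n_1 = 423$    $x_1 = 257$          $\hat{p}_1 = 257/423 \approx 0.6076$
Males (sample 2)      $n_2 = 366$    $x_2 = 242$          $\hat{p}_2 = 242/366 \approx 0.6612$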

We first check whether the conditions for the $Z$ test are valid: $x_1 = 257 \ge 5$, $(n_1 - x_1) = 166 \ge 5$, $x_2 = 242 \ge 5$, and $(n_2 - x_2) = 124 \ge 5$. We can therefore proceed.

  • Step 1 State the hypotheses and the rejection rule.

    We are interested in whether the proportion of females who report being very happily married is smaller than that of males, and because the females represent sample 1, the hypotheses are

    $$H_0: p_1 = p_2 \quad \text{versus} \quad H_a: p_1 < p_2$$

    where $p_1$ and $p_2$ represent the population proportions of all females and males, respectively, who report being very happily married. We will reject $H_0$ if the $p$-value $\le \alpha$.

    [image]
    FIGURE 18 $p$-Value for left-tailed test.
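  • Step 2 Find $Z_{\text{data}}$.

    First, use the crosstabulation counts (257 of 423 females and 242 of 366 males very happy) to find the value of $\hat{p}_{\text{pooled}}$.

    $$\hat{p}_{\text{pooled}} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{257 + 242}{423 + 366} = \frac{499}{789} \approx 0.6324$$

    Then, with $\hat{p}_1 = 257/423 \approx 0.6076$ and $\hat{p}_2 = 242/366 \approx 0.6612$,

    $$Z_{\text{data}} = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}_{\text{pooled}}\,(1 - \hat{p}_{\text{pooled}})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{0.6076 - 0.6612}{\sqrt{0.6324\,(0.3676)\left(\dfrac{1}{423} + \dfrac{1}{366}\right)}} \approx -1.56$$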

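  • Step 3 Find the $p$-value.

    Because it is a left-tailed test, the $p$-value is given by Table 14 as $P(Z < Z_{\text{data}})$, as shown in Figure 18. With 257 of 423 females and 242 of 366 males very happy, $Z_{\text{data}} \approx -1.56$. This amounts to a Case 1 problem from Table 8 in Chapter 6 on page 357:

    $$p\text{-value} = P(Z < -1.56) = 0.0594$$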

  • Step 4 State the conclusion and the interpretation.

    The $p$-value is not less than or equal to $\alpha$, so we do not reject $H_0$. There is insufficient evidence that the proportion of females who report being very happily married is smaller than the proportion of males who do so.

Note: When the $p$-value is close to $\alpha$, many data analysts prefer to simply assess the strength of evidence against the null hypothesis using criteria such as those given in Table 6 in Chapter 9 (page 514).
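The calculations in Example 16 can be checked with a short script built from the crosstabulation counts. This is a sketch rather than the text's own method: the value `alpha = 0.05` is an assumption made only so the comparison can run, since the level of significance is not shown in the example.

```python
from math import sqrt
from statistics import NormalDist

# Counts from the crosstabulation: females are sample 1, males are sample 2.
x1, n1 = 257, 423
x2, n2 = 242, 366
alpha = 0.05  # assumed for illustration; not taken from the example

p_hat1, p_hat2 = x1 / n1, x2 / n2
p_pooled = (x1 + x2) / (n1 + n2)
z_data = (p_hat1 - p_hat2) / sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
p_value = NormalDist().cdf(z_data)  # left-tailed test: P(Z < Z_data)

print(f"p_hat1 = {p_hat1:.4f}, p_hat2 = {p_hat2:.4f}, pooled = {p_pooled:.4f}")
print(f"Z_data = {z_data:.3f}, p-value = {p_value:.4f}")
print("reject H0" if p_value <= alpha else "do not reject H0")
```

Because the script keeps the unrounded test statistic, its $p$-value can differ slightly from one read off a standard normal table after rounding $Z_{\text{data}}$ to two decimal places.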

NOW YOU CAN DO

Exercises 9–12.

2 Independent Sample $Z$ Interval for $p_1 - p_2$

We have learned how to perform $Z$ tests for $p_1 - p_2$. Next, we learn how to use sample statistics to estimate $p_1 - p_2$ using a confidence interval.

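Confidence Interval for $p_1 - p_2$

For two independent random samples taken from two populations with population proportions $p_1$ and $p_2$, a $100(1 - \alpha)\%$ confidence interval for $p_1 - p_2$ is given by

$$(\hat{p}_1 - \hat{p}_2) \pm Z_{\alpha/2}\sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}}$$

where $\hat{p}_1$ and $n_1$ represent the sample proportion and sample size of the sample taken from population 1 with population proportion $p_1$; $\hat{p}_2$ and $n_2$ represent the sample proportion and sample size of the sample taken from population 2 with population proportion $p_2$; $\hat{q}_1 = 1 - \hat{p}_1$ and $\hat{q}_2 = 1 - \hat{p}_2$; the samples are drawn independently; and the following conditions are satisfied: $x_1 \ge 5$, $(n_1 - x_1) \ge 5$, $x_2 \ge 5$, and $(n_2 - x_2) \ge 5$.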

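Margin of Error

The margin of error $E$ for a $100(1 - \alpha)\%$ confidence interval for $p_1 - p_2$ is given by

$$E = Z_{\alpha/2}\sqrt{\frac{\hat{p}_1 (1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2 (1 - \hat{p}_2)}{n_2}}$$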
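A minimal Python sketch of the margin of error and the resulting confidence interval follows. The function names and the counts are hypothetical (they are not the Example 15 data), and $Z_{\alpha/2}$ is obtained from the standard normal inverse CDF rather than from a table.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(x1, n1, x2, n2, confidence):
    """Margin of error E for a confidence interval for p1 - p2."""
    p_hat1, p_hat2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # Z_{alpha/2}
    return z * sqrt(p_hat1 * (1 - p_hat1) / n1 + p_hat2 * (1 - p_hat2) / n2)

def confidence_interval(x1, n1, x2, n2, confidence):
    """(lower, upper) bounds: point estimate plus or minus E."""
    point_estimate = x1 / n1 - x2 / n2
    e = margin_of_error(x1, n1, x2, n2, confidence)
    return point_estimate - e, point_estimate + e

# Hypothetical counts: 60 of 100 successes in sample 1, 45 of 100 in sample 2.
lower, upper = confidence_interval(60, 100, 45, 100, 0.95)
print(round(lower, 4), round(upper, 4))
```

Unlike the test statistic, the interval uses the separate sample proportions rather than the pooled estimate, because no common proportion is assumed.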

EXAMPLE 17 $Z$ confidence interval for $p_1 - p_2$

Use the sample statistics from Example 15 to do the following:

  1. Calculate and interpret the margin of error for confidence level 99%.
  2. Construct and interpret a 99% confidence interval for $p_1 - p_2$.

Solution

The conditions for the confidence interval are the same as for the hypothesis test and were checked in Example 15.

  1. From Table 1 in Chapter 8 on page 432, the $Z_{\alpha/2}$ value for a 99% confidence level is 2.576. Therefore, the margin of error is

    $$E = 2.576\sqrt{\frac{\hat{p}_1 (1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2 (1 - \hat{p}_2)}{n_2}} \approx 0.079$$

    The margin of error is 0.079, so we may estimate $p_1 - p_2$ to within 0.079 with 99% confidence.

  2. The point estimate is $\hat{p}_1 - \hat{p}_2 = 0.16$. The 99% confidence interval is therefore

    $$0.16 \pm 0.079 = (0.081, 0.239)$$

    We are 99% confident that the difference in population proportions of males and females who agree that technology will lead to a better future lies between 0.081 and 0.239.

NOW YOU CAN DO

Exercises 13–18.

3 Use $Z$ Confidence Intervals to Perform $Z$ Tests for $p_1 - p_2$

Given a $100(1 - \alpha)\%$ confidence interval for $p_1 - p_2$, we may perform two-tailed tests at level of significance $\alpha$ for various hypothesized values of $p_1 - p_2$. If a proposed value lies outside the confidence interval for $p_1 - p_2$, then the null hypothesis specifying this value would be rejected. Otherwise, do not reject the null hypothesis.

EXAMPLE 18 Using a $Z$ interval for $p_1 - p_2$ to perform $Z$ tests about $p_1 - p_2$

This example asks whether $p_1 - p_2$ differs from (or is not equal to) a certain value, so we can use the confidence interval to test the hypotheses. Example 17 provided a 99% confidence interval for $p_1 - p_2$, the difference in population proportions of males and females who agree that technology will lead to a better future, as (0.081, 0.239). Test, using level of significance $\alpha = 0.01$, whether the difference in population proportions $p_1 - p_2$ differs from these values: (a) 0.1, (b) 0.2, (c) 0.3.

Solution

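  1. $H_0: p_1 - p_2 = 0.1$ versus $H_a: p_1 - p_2 \ne 0.1$.

    The hypothesized value 0.1 lies inside the interval (0.081, 0.239), so we do not reject $H_0$.

  2. $H_0: p_1 - p_2 = 0.2$ versus $H_a: p_1 - p_2 \ne 0.2$.

    The hypothesized value 0.2 lies inside the interval, so we do not reject $H_0$.

  3. $H_0: p_1 - p_2 = 0.3$ versus $H_a: p_1 - p_2 \ne 0.3$.

    The hypothesized value 0.3 lies outside the interval, so we reject $H_0$.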
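The interval-based decision rule reduces to checking whether each hypothesized value falls inside the interval. The sketch below applies it to the interval (0.081, 0.239) from Example 17; the function name `reject_from_interval` is hypothetical, and treating the endpoints as inside the interval is an assumption that does not matter for the three values tested.

```python
def reject_from_interval(lower, upper, hypothesized):
    """Two-tailed test: reject H0 when the hypothesized value is outside the interval."""
    return not (lower <= hypothesized <= upper)

interval = (0.081, 0.239)  # 99% confidence interval from Example 17
for value in (0.1, 0.2, 0.3):
    decision = "reject H0" if reject_from_interval(*interval, value) else "do not reject H0"
    print(f"H0: p1 - p2 = {value}: {decision}")
```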

NOW YOU CAN DO

Exercises 19–22.