9 Hypothesis Testing

9.5 $Z$ Test for the Population Proportion

OBJECTIVES By the end of this section, I will be able to …

Perform the $Z$ test for $p$ using the critical-value method.
Perform the $Z$ test for $p$ using the $p$ -value method.
Use confidence intervals for $p$ to perform two-tailed hypothesis tests about $p$ .

1 The $Z$ Test for $p$ Using the Critical-Value Method

For example, if a baseball player has $x = 30$ hits in $n = 100$ at-bats, his batting average is $\hat{p} = x / n = 30 / 100 = 0.3$ (or .300).

Thus far, we have dealt with testing hypotheses about the population mean $μ$ only. In this section, we will learn how to perform the $Z$ test for the population proportion $p$ . For our point estimate of the unknown population proportion $p$ , we use the sample proportion $\hat{p} = x / n$ , where $x$ equals the number of successes.

Just as with the $Z$ test for the mean, in the $Z$ test for the proportion the null hypothesis will include a certain hypothesized value for the unknown parameter, which we call $p_{0}$ . For example, the hypotheses for the two-tailed test have the following form:

$H_{0} : p = p_{0} \quad \text{versus} \quad H_{a} : p \neq p_{0}$

where $p_{0}$ represents a particular hypothesized value of the unknown population proportion $p$ . For instance, if a researcher is interested in determining whether the population proportion of Americans who support increased funding for higher education differs from 50%, then $p_{0} = 0.50$ and $q_{0} = 1 - p_{0} = 0.50$ .

If we assume $H_{0}$ is correct, then the population proportion of successes is $p_{0}$ . Then Facts 5 and 6 from Section 7.2 tell us that the sampling distribution of $\hat{p}$ has a mean of $p_{0}$ and the standard deviation

$σ_{\hat{p}} = \sqrt{\frac{p \cdot q}{n}} = \sqrt{\frac{p_{0} \cdot q_{0}}{n}}$

because we claim in $H_{0}$ that $p = p_{0}$ . Here, $σ_{\hat{p}}$ is called the standard error of the proportion. Fact 7 from Section 7.2 tells us that the sampling distribution of $\hat{p}$ is approximately normal whenever both of the following conditions are met: $n \cdot p \geq 5$ and $n \cdot q \geq 5$ . This leads us to the following statement of the essential idea about hypothesis testing for the proportion.

The Essential Idea About Hypothesis Testing for the Proportion

When the sample proportion $\hat{p}$ is unusual or extreme in the sampling distribution of $\hat{p}$ that is based on the assumption that $H_{0}$ is correct, we reject $H_{0}$ . Otherwise, there is insufficient evidence against $H_{0}$ , and we should not reject $H_{0}$ .

The remainder of this section explains the details of implementing hypothesis testing for the proportion. The critical-value method for the $Z$ test for $p$ is similar to that of the $Z$ test for $μ$ , in that we compare one $Z$ -value ( $Z_{data}$ ) with another $Z$ -value ( $Z_{crit}$ ). In this section, $Z_{data}$ represents the number of standard errors ( $σ_{\hat{p}}$ ) the sample proportion $\hat{p}$ lies above or below the hypothesized proportion $p_{0}$ .

The test statistic used for the $Z$ test for the proportion is

$Z_{data} = \frac{\hat{p} - p_{0}}{\sqrt{\frac{p_{0} \cdot q_{0}}{n}}}$

where $\hat{p}$ is the observed sample proportion of successes, $p_{0}$ is the value of $p$ hypothesized in $H_{0}$ , $q_{0} = 1 - p_{0}$ , and $n$ is the sample size.


EXAMPLE 25 Calculating $Z_{data}$ for the $Z$ test for proportion

The NPD Group reported in 2013 that sales of the Chromebook accounted for 20% of the U.S. computer market. Suppose a random sample of $n = 400$ computers found 76 that were Chromebooks. We are interested in testing whether the population proportion of Chromebooks has changed from 20%.

Construct the hypotheses.
Calculate the test statistic $Z_{data}$ .

Solution

The key words “has changed” indicate a two-tailed test. “Changed from what?” The hypothesized proportion $p_{0} = 0.20$ . The hypotheses are

$H_{0} : p = 0.20 \quad \text{versus} \quad H_{a} : p \neq 0.20$

The sample proportion of Chromebooks is

$\hat{p} = \frac{x}{n} = \frac{\text{number in sample that are Chromebooks}}{\text{sample size}} = \frac{76}{400} = 0.19$

We then calculate the value of the test statistic $Z_{data}$ :

$Z_{data} = \frac{\hat{p} - p_{0}}{\sqrt{\frac{p_{0} \cdot q_{0}}{n}}} = \frac{0.19 - 0.20}{\sqrt{\frac{0.20 (0.80)}{400}}} = \frac{- 0.01}{0.02} = - 0.5$
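The arithmetic in this example can be checked with a few lines of Python. This is only a minimal sketch: the function name `z_data` and its argument names are illustrative choices, not part of the text.

```python
from math import sqrt

def z_data(x, n, p0):
    """Test statistic for the Z test for a proportion (illustrative helper)."""
    p_hat = x / n                        # sample proportion of successes
    std_error = sqrt(p0 * (1 - p0) / n)  # standard error, assuming H0 is correct
    return (p_hat - p0) / std_error

# Example 25: 76 Chromebooks among 400 computers, hypothesized p0 = 0.20
z = z_data(76, 400, 0.20)
print(round(z, 2))  # -0.5
```

The standard error uses the hypothesized value $p_{0}$ rather than $\hat{p}$ , because the sampling distribution is built on the assumption that $H_{0}$ is correct.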

NOW YOU CAN DO

Exercises 7–14.

YOUR TURN #11

For Example 25, suppose the sample found 50 of 400 computers that were Chromebooks. Calculate the test statistic $Z_{data}$ .

(The solution is shown in Appendix A.)

To find the $Z_{crit}$ critical values, the critical regions, or the rejection rules, you can use Table 11.

Table 11 Table of critical values $Z_{crit}$ for common values of the level of significance $α$

	Form of Hypothesis Test
Level of significance $α$	Right-tailed $\begin{array}{l} H_{0} : p = p_{0} \\ H_{a} : p > p_{0} \end{array}$	Left-tailed $\begin{array}{l} H_{0} : p = p_{0} \\ H_{a} : p < p_{0} \end{array}$	Two-tailed $\begin{array}{l} H_{0} : p = p_{0} \\ H_{a} : p \neq p_{0} \end{array}$
0.10	$Z_{crit} = 1.28$	$Z_{crit} = - 1.28$	$Z_{crit} = 1.645$
0.05	$Z_{crit} = 1.645$	$Z_{crit} = - 1.645$	$Z_{crit} = 1.96$
0.01	$Z_{crit} = 2.33$	$Z_{crit} = - 2.33$	$Z_{crit} = 2.58$
Rejection rule	Reject $H_{0}$ if $Z_{data} \geq Z_{crit}$	Reject $H_{0}$ if $Z_{data} \leq Z_{crit}$	Reject $H_{0}$ if $Z_{data} \leq - Z_{crit}$ or $Z_{data} \geq Z_{crit}$
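The critical values in Table 11 are quantiles of the standard normal distribution, so they can be reproduced with software. The sketch below assumes Python 3.8 or later, whose standard library provides `statistics.NormalDist`. The helper name `z_crit` and the tail labels `"right"`, `"left"`, and `"two"` are illustrative assumptions.

```python
from statistics import NormalDist

def z_crit(alpha, tail):
    """Critical value for a right-, left-, or two-tailed Z test (illustrative helper)."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "right":
        return std_normal.inv_cdf(1 - alpha)      # area alpha in the right tail
    if tail == "left":
        return std_normal.inv_cdf(alpha)          # area alpha in the left tail
    if tail == "two":
        return std_normal.inv_cdf(1 - alpha / 2)  # area alpha/2 in each tail
    raise ValueError("tail must be 'right', 'left', or 'two'")

# Print one row per level of significance, in the column order of Table 11
for alpha in (0.10, 0.05, 0.01):
    print(alpha, [round(z_crit(alpha, t), 3) for t in ("right", "left", "two")])
```

The printed values carry more decimal places than the table, which rounds to two or three places.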


$Z$ test for the Population Proportion $p$ : Critical-value Method

When a random sample of size $n$ is taken from a population, you can use the $Z$ test for the proportion if both of the normality conditions are satisfied:

$n \cdot p_{0} \geq 5 \quad \text{and} \quad n \cdot q_{0} \geq 5$

Step 1 State the hypotheses.

Use one of the forms from Table 11. State the meaning of $p$ .
Step 2 Find $Z_{crit}$ and state the rejection rule.

Use Table 11.
Step 3 Calculate $Z_{data}$ .

$Z_{data} = \frac{\hat{p} - p_{0}}{σ_{\hat{p}}} = \frac{\hat{p} - p_{0}}{\sqrt{\frac{p_{0} \cdot q_{0}}{n}}}$
Step 4 State the conclusion and the interpretation.

If $Z_{data}$ falls in the critical region, then reject $H_{0}$ . Otherwise, do not reject $H_{0}$ . Interpret the conclusion so that a nonspecialist can understand.

EXAMPLE 26 $Z$ test for $p$ using the critical-value method

As a check on your arithmetic, the two quantities you obtain when checking the normality conditions should add up to $n$ . Here, $80 + 320 = 400 = n$ .

Refer to Example 25. Test whether the population proportion of Chromebook computers has changed from 20%, using the critical-value method and level of significance $α = 0.10$ .

Solution

First, we check that both of our normality conditions are met. From Example 25, we have $p_{0} = 0.20$ and $n = 400$ .

$n \cdot p_{0} = (400) (0.20) = 80 \geq 5 \quad \text{and} \quad n \cdot q_{0} = (400) (0.80) = 320 \geq 5$

The normality conditions are met and we may proceed with the hypothesis test.

Step 1 State the hypotheses.

From Example 25, our hypotheses are

$H_{0} : p = 0.20 \quad \text{versus} \quad H_{a} : p \neq 0.20$

where $p$ represents the population proportion of computers that are Chromebooks.
Step 2 Find $Z_{crit}$ and state the rejection rule.

We have a two-tailed test, with $α = 0.10$ . This gives us our critical value $Z_{crit} = 1.645$ . The rejection rule from Table 11 is: Reject $H_{0}$ if $Z_{data} \geq 1.645$ or $Z_{data} \leq - 1.645$ (Figure 35).

FIGURE 35 $Z_{data}$ does not fall in the critical region.
Step 3 Calculate $Z_{data}$ .

From Example 25, we have $Z_{data} = - 0.5$ .

Step 4 State the conclusion and the interpretation.

The test statistic $Z_{data} = - 0.5$ is not $\geq 1.645$ and not $\leq - 1.645$ . Thus, we do not reject $H_{0}$ . There is insufficient evidence at level of significance $α = 0.10$ that the population proportion of computers that are Chromebooks differs from 20%.
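The four steps of the critical-value method can be collected into one routine. The following is a sketch under stated assumptions: the function name `critical_value_test`, its parameters, and the tail labels are hypothetical, and `statistics.NormalDist` requires Python 3.8 or later. It checks the normality conditions, computes $Z_{data}$ , finds $Z_{crit}$ , and applies the rejection rule from Table 11.

```python
from math import sqrt
from statistics import NormalDist

def critical_value_test(x, n, p0, alpha, tail):
    """Z test for p by the critical-value method.

    Returns (z_data, z_crit, reject). Illustrative helper, not from the text.
    """
    q0 = 1 - p0
    # Normality conditions must hold before the Z test is used
    if n * p0 < 5 or n * q0 < 5:
        raise ValueError("normality conditions n*p0 >= 5 and n*q0 >= 5 are not met")

    z_data = (x / n - p0) / sqrt(p0 * q0 / n)
    std_normal = NormalDist()

    if tail == "right":
        z_crit = std_normal.inv_cdf(1 - alpha)
        reject = z_data >= z_crit
    elif tail == "left":
        z_crit = std_normal.inv_cdf(alpha)
        reject = z_data <= z_crit
    elif tail == "two":
        z_crit = std_normal.inv_cdf(1 - alpha / 2)
        reject = z_data <= -z_crit or z_data >= z_crit
    else:
        raise ValueError("tail must be 'right', 'left', or 'two'")
    return z_data, z_crit, reject

# Example 26: 76 of 400 computers, p0 = 0.20, two-tailed test, alpha = 0.10
z, crit, reject = critical_value_test(76, 400, 0.20, alpha=0.10, tail="two")
print(round(z, 2), round(crit, 3), reject)  # -0.5 1.645 False
```

A result of `False` corresponds to "do not reject $H_{0}$ ", matching the conclusion of Example 26.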

NOW YOU CAN DO

Exercises 15–18.

2 $Z$ Test for $p$ : The $p$ -Value Method

The $p$ -value method for the $Z$ test for $p$ is equivalent to the critical-value method. The $p$ -values are defined similarly to those for the $Z$ test for $μ$ , as shown in Table 12.

Table 12 Finding the $p$ -value depends on the form of the hypothesis test

Type of test	Right-tailed test	Left-tailed test	Two-tailed test
Hypotheses	$\begin{array}{l} H_{0} : p = p_{0} \\ H_{a} : p > p_{0} \end{array}$	$\begin{array}{l} H_{0} : p = p_{0} \\ H_{a} : p < p_{0} \end{array}$	$\begin{array}{l} H_{0} : p = p_{0} \\ H_{a} : p \neq p_{0} \end{array}$
$p$ -value is tail area associated with $Z_{data}$	$\begin{array}{l} p\text{-value} = P (Z > Z_{data}) \\ \text{Area to right of } Z_{data} \end{array}$	$\begin{array}{l} p\text{-value} = P (Z < Z_{data}) \\ \text{Area to left of } Z_{data} \end{array}$	$\begin{array}{l} p\text{-value} = P (Z > | Z_{data} |) + P (Z < - | Z_{data} |) \\ \quad = 2 \cdot P (Z > | Z_{data} |) \\ \text{Sum of the two tail areas} \end{array}$

Note that the $p$ -value has precisely the same definition and behavior as in the $Z$ test for the population mean $μ$ . That is, the $p$ -value is roughly a measure of how extreme your value of $Z_{data}$ is and takes values between 0 and 1, with small values indicating extreme values of $Z_{data}$ .
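The three tail-area rules in Table 12 translate directly into code. This is a minimal sketch, assuming Python 3.8 or later for `statistics.NormalDist`. The function name `p_value` and the tail labels are illustrative.

```python
from statistics import NormalDist

def p_value(z_data, tail):
    """Tail area associated with z_data, following Table 12 (illustrative helper)."""
    std_normal = NormalDist()  # standard normal distribution
    if tail == "right":
        return 1 - std_normal.cdf(z_data)             # area to the right of z_data
    if tail == "left":
        return std_normal.cdf(z_data)                 # area to the left of z_data
    if tail == "two":
        return 2 * (1 - std_normal.cdf(abs(z_data)))  # sum of the two tail areas
    raise ValueError("tail must be 'right', 'left', or 'two'")

print(round(p_value(1.36, "right"), 4))  # 0.0869
print(round(p_value(-0.5, "left"), 4))   # 0.3085
print(round(p_value(-0.5, "two"), 4))    # 0.6171
```

Every result lies between 0 and 1, and a more extreme $Z_{data}$ gives a smaller $p$ -value.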

Developing Your Statistical Sense

The Difference Between the $p$ -Value and the Population Proportion $p$

Be careful to distinguish between the $p$ -value and the population proportion $p$ . The latter represents the population proportion of successes for a binomial experiment and is a population parameter. The $p$ -value is the probability of observing a value of $Z_{data}$ at least as extreme as the $Z_{data}$ actually observed. The $p$ -value depends on the sample data, but the population proportion $p$ does not depend on the sample data.

$Z$ test for the Population Proportion $p$ : $p$ -value Method

When a random sample of size $n$ is taken from a population, you can use the $Z$ test for the proportion if both of the normality conditions are satisfied:

$n \cdot p_{0} \geq 5 \quad \text{and} \quad n \cdot q_{0} \geq 5$


Step 1 State the hypotheses and the rejection rule.

Use one of the forms from Table 12. State the meaning of $p$ . State the rejection rule as “Reject $H_{0}$ when the $p$ -value $\leq α$ .”
Step 2 Calculate $Z_{data}$ .

$Z_{data} = \frac{\hat{p} - p_{0}}{\sqrt{\frac{p_{0} \cdot q_{0}}{n}}}$
Step 3 Find the $p$ -value.

Either use technology to find the $p$ -value, or calculate it using the form in Table 12 that corresponds to your hypotheses.
Step 4 State the conclusion and the interpretation.

If the $p$ -value $\leq α$ , then reject $H_{0}$ . Otherwise, do not reject $H_{0}$ . Interpret your conclusion so that a nonspecialist can understand.

EXAMPLE 27 $Z$ test for $p$ using the $p$ -value method

We report $Z_{data}$ to two decimal places to allow the use of the $Z$ table to calculate the $p$ -value.

The National Transportation Safety Board publishes statistics on the number of automobile crashes that people in various age groups have. Young people ages 18–24 have an accident rate of 12%, meaning that on average 12 out of every 100 young drivers per year had an accident. A researcher claims that the population proportion of young drivers having accidents is greater than 12%. Her study examined 1000 young drivers ages 18–24 and found that 134 had an accident this year. Perform the appropriate hypothesis test using the $p$ -value method with level of significance $α = 0.05$ .

Solution

First, we check that both of our normality conditions are met. We are interested in whether the proportion has increased from 12%, so we have $p_{0} = 0.12$ .

$n \cdot p_{0} = (1000) (0.12) = 120 \geq 5 \quad \text{and} \quad n \cdot q_{0} = (1000) (0.88) = 880 \geq 5$

The normality conditions are met and we may proceed with the hypothesis test.

Step 1 State the hypotheses and the rejection rule.

Our hypotheses are

$H_{0} : p = 0.12 \quad \text{versus} \quad H_{a} : p > 0.12$

where $p$ represents the population proportion of young people ages 18–24 who had an accident. We reject the null hypothesis if the $p$ -value $\leq α = 0.05$ .
Step 2 Calculate $Z_{data}$ .

Our sample proportion is $\hat{p} = 134 / 1000 = 0.134$ . Because $p_{0} = 0.12$ , the standard error of $\hat{p}$ is

$σ_{\hat{p}} = \sqrt{\frac{p_{0} \cdot q_{0}}{n}} = \sqrt{\frac{(0.12) (0.88)}{1000}} \approx 0.0103$

Thus, our test statistic is

$Z_{data} = \frac{\hat{p} - p_{0}}{\sqrt{\frac{p_{0} \cdot q_{0}}{n}}} = \frac{0.134 - 0.12}{\sqrt{\frac{(0.12) (0.88)}{1000}}} \approx 1.36$


FIGURE 36 $p$ -Value for a right-tailed test equals area to right of $Z_{data}$ .

That is, the sample proportion $\hat{p} = 0.134$ lies approximately 1.36 standard errors above the hypothesized proportion $p_{0} = 0.12$ .
Step 3 Find the $p$ -value.

We have a right-tailed test, so our $p$ -value from Table 12 is $P (Z > Z_{data})$ . This is a Case 2 problem from Table 8 in Chapter 6 (page 355), where we find the tail area by subtracting the $Z$ table area from 1 (Figure 36):

$P (Z > Z_{data}) = P (Z > 1.36) = 1 - 0.9131 = 0.0869$
Step 4 State the conclusion and the interpretation.

The $p$ -value 0.0869 is not ≤ $α = 0.05$ , so we do not reject $H_{0}$ . There is insufficient evidence that the population proportion of young people ages 18–24 who had an accident has increased.
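The calculation in this example can be reproduced as a short script. It is only a sketch, and the variable names are illustrative. Following the margin note, $Z_{data}$ is rounded to two decimal places before the tail area is found, so the result matches the $Z$ table. Software that keeps the unrounded value reports a slightly different $p$ -value.

```python
from math import sqrt
from statistics import NormalDist

# Example 27: 134 accidents among 1000 young drivers, p0 = 0.12, alpha = 0.05
x, n, p0, alpha = 134, 1000, 0.12, 0.05

p_hat = x / n                                # sample proportion, 0.134
std_error = sqrt(p0 * (1 - p0) / n)          # standard error under H0
z_data = round((p_hat - p0) / std_error, 2)  # rounded to two places, as in the text

p_val = 1 - NormalDist().cdf(z_data)         # right-tailed test: area to the right
reject = p_val <= alpha

print(z_data, round(p_val, 4), reject)  # 1.36 0.0869 False
```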

NOW YOU CAN DO

Exercises 19–22.

YOUR TURN #12

For Example 27, suppose 150 of the 1000 young drivers ages 18–24 had an accident this year. Now test whether the population proportion of young drivers who had an accident exceeds 0.12, using level of significance $α = 0.05$ .

(The solution is shown in Appendix A.)

EXAMPLE 28 Performing the $Z$ test for $p$ using technology

A study reported that 1% of American Internet users who are married or in a long-term relationship met on a blind date or through a dating service.¹³ A survey of 500 American Internet users who are married or in a long-term relationship found 8 who met on a blind date or through a dating service. If appropriate, test whether the population proportion has increased. Use the $p$ -value method with level of significance $α = 0.05$ .

Solution

We have $p_{0} = 0.01$ and $n = 500$ . Checking the normality conditions, we have

$n \cdot p_{0} = (500) (0.01) = 5 \geq 5 \quad \text{and} \quad n \cdot q_{0} = (500) (0.99) = 495 \geq 5$

The normality conditions are met and we may proceed with the hypothesis test.

Step 1 State the hypotheses and the rejection rule.

Our hypotheses are

$H_{0} : p = 0.01 \quad \text{versus} \quad H_{a} : p > 0.01$

where $p$ represents the population proportion of American Internet users who are married or in a long-term relationship and who met on a blind date or through a dating service. We will reject $H_{0}$ if the $p$ -value $\leq 0.05$ .
Step 2 Calculate $Z_{data}$ .

We use the instructions supplied in the Step-by-Step Technology Guide on page 552. Figure 37 shows the TI-83/84 results from the $Z$ test for $p$ , Figure 38 shows the results from Minitab, and Figure 39 shows the results from CrunchIt!.

FIGURE 37 TI-83/84 results.


Note: Minitab, TI-83/84, and CrunchIt! round results to different numbers of decimal places.

FIGURE 38 Minitab results.

We have

$Z_{data} = \frac{\hat{p} - p_{0}}{\sqrt{\frac{p_{0} \cdot q_{0}}{n}}} = \frac{0.016 - 0.01}{\sqrt{\frac{(0.01) (0.99)}{500}}} \approx 1.348399725$

which concurs with the TI-83/84 results in Figure 37.
Step 3 Find the $p$ -value.

From Figures 37, 38, 39, and 40 we have

$p\text{-value} = P (Z > 1.348399725) = 0.0887649866 \approx 0.08876$

FIGURE 39 CrunchIt! results.

FIGURE 40 $p$ -Value for a right-tailed test.
Step 4 State the conclusion and interpretation.

Because the $p$ -value $\approx 0.08876$ is not $\leq α = 0.05$ , we do not reject $H_{0}$ . There is insufficient evidence that the population proportion of American Internet users who are married or in a long-term relationship and who met on a blind date or through a dating service has increased.

3 Using Confidence Intervals for $p$ to Perform Two-Tailed Hypothesis Tests About $p$

Just as for $μ$ , we can use a $100 (1 - α) \%$ confidence interval for the population proportion $p$ in order to perform a set of two-tailed hypothesis tests for $p$ .

EXAMPLE 29 Using a confidence interval for $p$ to perform two-tailed hypothesis tests about $p$

In 2013, Facebook reported that 73% of its users access Facebook using a mobile device. Suppose that a 95% confidence interval for the population proportion of mobile accessers is (lower bound = 0.70, upper bound = 0.76). Use the confidence interval to test, using level of significance $α = 0.05$ , whether the population proportion differs from

a. 0.69
b. 0.72
c. 0.77

Solution

There is equivalence between a $100 (1 - α) \%$ confidence interval for $p$ and a two-tailed test for $p$ with level of significance $α$ . Values of $p_{0}$ that lie outside the confidence interval lead to rejection of the null hypothesis, whereas values of $p_{0}$ within the confidence interval lead to not rejecting the null hypothesis. Figure 41 illustrates the 95% confidence interval for $p$ .

FIGURE 41 Reject $H_{0}$ for values $p_{0}$ that lie outside the interval (0.70, 0.76).

We want to perform the following two-tailed hypothesis tests:

$H_{0} : p = 0.69 \quad \text{versus} \quad H_{a} : p \neq 0.69$
$H_{0} : p = 0.72 \quad \text{versus} \quad H_{a} : p \neq 0.72$
$H_{0} : p = 0.77 \quad \text{versus} \quad H_{a} : p \neq 0.77$

To perform each hypothesis test, simply observe where each value of $p_{0}$ falls on the number line. For example, in the first hypothesis test, the hypothesized value $p_{0} = 0.69$ lies outside the interval (0.70, 0.76). Thus, we reject $H_{0}$ . The three hypothesis tests are summarized here.

Value of $p_{0}$	Form of hypothesis test, with $α = 0.05$	Where $p_{0}$ lies in relation to 95% confidence interval	Conclusion of hypothesis test
a. 0.69	$\begin{matrix} H_{0} : p = 0.69 & H_{a} : p \neq 0.69 \end{matrix}$	Outside	Reject $H_{0}$
b. 0.72	$\begin{matrix} H_{0} : p = 0.72 & H_{a} : p \neq 0.72 \end{matrix}$	Inside	Do not reject $H_{0}$
c. 0.77	$\begin{matrix} H_{0} : p = 0.77 & H_{a} : p \neq 0.77 \end{matrix}$	Outside	Reject $H_{0}$
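The decision rule used in this example amounts to checking whether $p_{0}$ lies inside the interval. A minimal sketch follows. The function name `reject_from_interval` is hypothetical, and treating a value exactly on an endpoint as outside the interval is an assumption made here, since the example does not cover that case.

```python
def reject_from_interval(p0, lower, upper):
    """Two-tailed decision from a confidence interval (illustrative helper).

    Returns True (reject H0) when p0 lies outside the interval.
    Assumption: a value exactly on an endpoint is treated as outside.
    """
    return not (lower < p0 < upper)

# Example 29: 95% confidence interval (0.70, 0.76), alpha = 0.05
for p0 in (0.69, 0.72, 0.77):
    decision = "Reject H0" if reject_from_interval(p0, 0.70, 0.76) else "Do not reject H0"
    print(p0, decision)
```

The confidence level and the level of significance must correspond, so a 95% interval can only be used for two-tailed tests with $α = 0.05$ .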

NOW YOU CAN DO

Exercises 23–26.

EXAMPLE 30 Interpreting software output

Each of (a) and (b) represents software output from a $Z$ test for $p$ . For each, examine the indicated software output, and provide the following steps:

Step 1 State the hypotheses and the rejection rule.
Step 2 Find $Z_{data}$ .
Step 3 Find the $p$ -value.
Step 4 State the conclusion and the interpretation.

Use level of significance $α = 0.05$ for each hypothesis test.

a. TI-83/84 output for a $Z$ test for $p$ , where $p$ represents the population proportion of quiz questions answered correctly.
b. Minitab output for a $Z$ test for $p$ , where $p$ represents the population proportion of counties having at least one specialty store.

Solution

Interpreting the TI-83/84 output
- Step 1 State the hypotheses and the rejection rule.
  
  In the TI-83/84 output, the “ $prop > .75$ ” tells us that we have a right-tailed test:
  
  $H_{0} : p = 0.75 \quad \text{versus} \quad H_{a} : p > 0.75$
  
  where $p$ represents the population proportion of quiz questions answered correctly. We will reject $H_{0}$ if the $p$ -value is less than or equal to the level of significance $α = 0.05$ .
- Step 2 Find $Z_{data}$ .
  
  The “ $z = 2.309401077$ ” in the TI-83/84 output gives us the value for $Z_{data}$ .
- Step 3 Find the $p$ -value.
  
  Here, we need to be a little bit careful, because there are two items containing $p$ in the TI-83/84 output. Don't pick $\hat{p}$ , which represents the sample proportion of successes. Instead, the $p$ -value is given as “ $p = .0104606407$ .”
- Step 4 State the conclusion and the interpretation.
  
  The $p$ -value is less than the level of significance $α = 0.05$ , so we reject $H_{0}$ . There is evidence that the population proportion of quiz questions answered correctly is greater than 0.75.
Interpreting the Minitab output
- Step 1 State the hypotheses and the rejection rule.
  
  The line “Test of $p = 0.62$ vs $p \neq 0.62$ ” tells us that we have the following two-tailed test.
  
  $H_{0} : p = 0.62 \quad \text{versus} \quad H_{a} : p \neq 0.62$
  
  where $p$ represents the population proportion of counties having at least one specialty store. We will reject $H_{0}$ if the $p$ -value is less than or equal to the level of significance $α = 0.05$ .
- Step 2 Find $Z_{data}$ .
  
  The “ $Z$ -value” of 1.46 in the Minitab output gives us the value for $Z_{data}$ .
- Step 3 Find the $p$ -value.
  
  Under “P-Value,” Minitab gives us a $p$ -value of 0.143.
- Step 4 State the conclusion and the interpretation.
  
  The $p$ -value is not less than the level of significance $α = 0.05$ , so we do not reject $H_{0}$ . There is insufficient evidence that the population proportion of counties having at least one specialty store differs from 0.62.
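As a cross-check on reading software output, the $p$ -values can be recomputed from the reported $Z$ statistics. The sketch below assumes Python 3.8 or later for `statistics.NormalDist`, and the variable names are illustrative. The Minitab result is reproduced only approximately, presumably because the displayed $Z$ -value of 1.46 is itself rounded.

```python
from statistics import NormalDist

std_normal = NormalDist()

# (a) TI-83/84: right-tailed test, z = 2.309401077
p_ti = 1 - std_normal.cdf(2.309401077)

# (b) Minitab: two-tailed test, displayed Z-value 1.46 (rounded by the software)
p_minitab = 2 * (1 - std_normal.cdf(1.46))

print(round(p_ti, 5), round(p_minitab, 3))
print(p_ti <= 0.05, p_minitab <= 0.05)  # reject H0 in (a) only
```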

NOW YOU CAN DO

Exercises 27–30.
