Chapter 8: Inference for Proportions

SECTION 8.2 SUMMARY

• The large-sample estimate of the difference in two population proportions is

$D = {\hat{p}}_{1} - {\hat{p}}_{2}$

where ${\hat{p}}_{1}$ and ${\hat{p}}_{2}$ are the sample proportions:

${\hat{p}}_{1} = \frac{X_{1}}{n_{1}}$ and ${\hat{p}}_{2} = \frac{X_{2}}{n_{2}}$

• The standard error of the difference D is

${SE}_{D} = \sqrt{\frac{{\hat{p}}_{1} (1 - {\hat{p}}_{1})}{n_{1}} + \frac{{\hat{p}}_{2} (1 - {\hat{p}}_{2})}{n_{2}}}$

• The margin of error for confidence level C is

$m = z^{*} {SE}_{D}$

where z* is the critical value for which the standard Normal density curve has area C between −z* and z*. The large-sample level C confidence interval is

$D \pm m$

We recommend using this interval for 90%, 95%, or 99% confidence when the number of successes and the number of failures in both samples are all at least 10. When sample sizes are smaller, alternative procedures such as the plus four estimate of the difference in two population proportions are recommended.
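
The interval calculation can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function name `two_proportion_ci` and the counts in the usage lines are hypothetical, and z* is taken from the standard library's `statistics.NormalDist` rather than from a table.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ci(x1, n1, x2, n2, conf_level=0.95):
    """Large-sample confidence interval for p1 - p2 (hypothetical helper)."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    d = p1_hat - p2_hat
    # Unpooled standard error SE_D of the difference.
    se_d = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    # z* leaves area C between -z* and z*, so the area below z* is (1 + C) / 2.
    z_star = NormalDist().inv_cdf((1 + conf_level) / 2)
    m = z_star * se_d
    return d, se_d, m, (d - m, d + m)


# Illustrative counts only (assumed, not data from the text).
d, se_d, m, (lower, upper) = two_proportion_ci(60, 100, 45, 100)
print(f"D = {d:.3f}, SE_D = {se_d:.4f}, m = {m:.4f}")
print(f"95% CI: ({lower:.4f}, {upper:.4f})")
```

With these assumed counts every sample has at least 10 successes and 10 failures, so the large-sample guideline above is met.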

• Significance tests of H₀: p₁ = p₂ use the z statistic

$z = \frac{{\hat{p}}_{1} - {\hat{p}}_{2}}{{SE}_{Dp}}$

with P-values from the N(0, 1) distribution. In this statistic,

${SE}_{Dp} = \sqrt{\hat{p} (1 - \hat{p}) \left(\frac{1}{n_{1}} + \frac{1}{n_{2}}\right)}$

and $\hat{p}$ is the pooled estimate of the common value of p₁ and p₂:

$\hat{p} = \frac{X_{1} + X_{2}}{n_{1} + n_{2}}$

Use this test when the number of successes and the number of failures in each of the samples are at least 5.
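
As a rough sketch, the pooled z test could be computed as follows. The function name `two_proportion_z_test` and the example counts are hypothetical, and the P-value shown assumes a two-sided alternative (p₁ ≠ p₂), which the summary above does not specify; a one-sided alternative would use a single tail area instead.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled z statistic for H0: p1 = p2 (hypothetical helper)."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    # Pooled estimate of the common proportion under H0.
    p_pooled = (x1 + x2) / (n1 + n2)
    se_dp = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se_dp
    # Assumption: two-sided alternative, so double the upper-tail area of |z|.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Illustrative counts only (assumed, not data from the text).
z, p_value = two_proportion_z_test(60, 100, 45, 100)
print(f"z = {z:.3f}, two-sided P-value = {p_value:.4f}")
```

Note that the test uses the pooled standard error, while the confidence interval uses the unpooled one, so the two procedures can differ slightly for the same data.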

• Relative risk is the ratio of two sample proportions:

$RR = \frac{{\hat{p}}_{1}}{{\hat{p}}_{2}}$

Confidence intervals for relative risk are often used to summarize the comparison of two proportions.
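
The summary gives no formula for a relative risk interval. The sketch below computes RR as defined above and then, as a clearly labeled assumption, uses one common choice: a large-sample interval built on the log scale with standard error $\sqrt{1/X_{1} - 1/n_{1} + 1/X_{2} - 1/n_{2}}$. The function name `relative_risk_ci` and the counts are hypothetical, and both success counts must be greater than zero.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def relative_risk_ci(x1, n1, x2, n2, conf_level=0.95):
    """Relative risk with a log-scale interval (hypothetical helper)."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    rr = p1_hat / p2_hat
    # Assumption: log-transform interval, not taken from the summary.
    # Requires x1 > 0 and x2 > 0.
    se_log_rr = sqrt(1 / x1 - 1 / n1 + 1 / x2 - 1 / n2)
    z_star = NormalDist().inv_cdf((1 + conf_level) / 2)
    lower = exp(log(rr) - z_star * se_log_rr)
    upper = exp(log(rr) + z_star * se_log_rr)
    return rr, (lower, upper)


# Illustrative counts only (assumed, not data from the text).
rr, (lower, upper) = relative_risk_ci(60, 100, 45, 100)
print(f"RR = {rr:.3f}, 95% CI: ({lower:.3f}, {upper:.3f})")
```

Because the interval is symmetric on the log scale, it is not symmetric around RR itself; an interval that excludes 1 suggests the two proportions differ.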