• The large-sample estimate of the difference in two population proportions is
$$D = \hat{p}_1 - \hat{p}_2$$
where $\hat{p}_1$ and $\hat{p}_2$ are the sample proportions:
$$\hat{p}_1 = \frac{X_1}{n_1} \quad\text{and}\quad \hat{p}_2 = \frac{X_2}{n_2}$$
• The standard error of the difference D is
$$SE_D = \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$$
• The margin of error for confidence level C is
$$m = z^* \, SE_D$$
where $z^*$ is the value for the standard Normal density curve with area C between $-z^*$ and $z^*$. The large-sample level C confidence interval is
$$D \pm m$$
We recommend using this interval for 90%, 95%, or 99% confidence when the number of successes and the number of failures in both samples are all at least 10. When sample sizes are smaller, alternative procedures such as the plus four estimate of the difference in two population proportions are recommended.
• Significance tests of $H_0\colon p_1 = p_2$ use the z statistic
$$z = \frac{\hat{p}_1 - \hat{p}_2}{SE_{D_p}}$$
with P-values from the N(0, 1) distribution. In this statistic,
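$$SE_{D_p} = \sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$
and $\hat{p}$ is the pooled estimate of the common value of $p_1$ and $p_2$:
$$\hat{p} = \frac{X_1 + X_2}{n_1 + n_2}$$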
Use this test when the number of successes and the number of failures in each of the samples are at least 5.
• Relative risk is the ratio of two sample proportions:
$$RR = \frac{\hat{p}_1}{\hat{p}_2}$$
Confidence intervals for relative risk are often used to summarize the comparison of two proportions.
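The formulas in this summary can be sketched in a few lines of Python. This is a minimal illustration, not a full statistical routine: the function names and the example counts (60 successes in 100 trials versus 45 in 100) are hypothetical, the P-value is computed for a two-sided alternative (an assumption, since the summary does not name the alternative), and no check of the success and failure counts is performed.

```python
from math import sqrt
from statistics import NormalDist  # standard library, Python 3.8+


def diff_proportions_ci(x1, n1, x2, n2, level=0.95):
    """Large-sample level C confidence interval for p1 - p2 (D +/- m)."""
    p1, p2 = x1 / n1, x2 / n2
    d = p1 - p2
    # Unpooled standard error of the difference D
    se_d = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    # z* leaves area `level` between -z* and z* under the standard Normal curve
    z_star = NormalDist().inv_cdf((1 + level) / 2)
    m = z_star * se_d
    return d - m, d + m


def pooled_z_test(x1, n1, x2, n2):
    """z statistic and two-sided P-value for H0: p1 = p2."""
    p1, p2 = x1 / n1, x2 / n2
    # Pooled estimate of the common proportion under H0
    p_pool = (x1 + x2) / (n1 + n2)
    se_dp = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se_dp
    # Two-sided P-value from N(0, 1); assumed alternative is p1 != p2
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


def relative_risk(x1, n1, x2, n2):
    """Ratio of the two sample proportions."""
    return (x1 / n1) / (x2 / n2)


# Hypothetical counts, for illustration only
low, high = diff_proportions_ci(60, 100, 45, 100)
z, p_value = pooled_z_test(60, 100, 45, 100)
rr = relative_risk(60, 100, 45, 100)
print(f"95% CI for p1 - p2: ({low:.4f}, {high:.4f})")
print(f"z = {z:.3f}, two-sided P-value = {p_value:.4f}")
print(f"RR = {rr:.3f}")
```

Note that the confidence interval uses the unpooled standard error, while the test uses the pooled one, because the pooled estimate is only appropriate when the null hypothesis of equal proportions is assumed.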