Chapter Specifics
• A confidence interval estimates an unknown parameter. A test of significance assesses the evidence for some claim about the value of an unknown parameter.
• In practice, the purpose of a statistical test is to answer the question, “Could the effect we see in the sample just be an accident due to chance, or is it good evidence that the effect is really there in the population?”
• Significance tests answer this question by giving the probability that a sample effect as large as the one we see in this sample would arise just by chance. This probability is the P-value. A small P-value says that our outcome is unlikely to happen just by chance.
• To set up a test, state a null hypothesis that says the effect you seek is not present in the population. The alternative hypothesis says that the effect is present.
• The P-value is the probability, calculated taking the null hypothesis to be true, of an outcome at least as extreme, in the direction specified by the alternative hypothesis, as the outcome actually observed.
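• A sample result is statistically significant at the 5% level (or at the 0.05 level) if a result at least this extreme would occur just by chance no more than 5% of the time in repeated samples when the null hypothesis is true; that is, if the P-value is 0.05 or less.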
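The definitions above can be illustrated with a short calculation. The following Python sketch is only an illustration under stated assumptions: the data (60 successes in 100 trials), the null value p0 = 0.5, the one-sided alternative, and the function names are all hypothetical and do not come from this chapter. It uses the Normal approximation to the sampling distribution of a sample proportion.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Cumulative probability of the standard Normal distribution at z."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def one_sided_p_value(successes, n, p0):
    """Test H0: p = p0 against Ha: p > p0 using the Normal approximation.

    Returns the standard score z and the P-value, which is the probability,
    computed as if H0 were true, of a sample proportion at least as large
    as the one observed.
    """
    p_hat = successes / n                 # observed sample proportion
    se = sqrt(p0 * (1.0 - p0) / n)        # standard deviation of p_hat under H0
    z = (p_hat - p0) / se                 # standardized distance from p0
    return z, 1.0 - normal_cdf(z)         # upper-tail probability


# Hypothetical data, assumed purely for illustration.
z, p_value = one_sided_p_value(successes=60, n=100, p0=0.5)
print(f"z = {z:.2f}, P-value = {p_value:.4f}")
print("Significant at the 5% level:", p_value <= 0.05)
print("Significant at the 1% level:", p_value <= 0.01)
```

With these assumed numbers the standard score is about 2, so the P-value is a little over 0.02: small enough to be significant at the 5% level but not at the 1% level.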
In this chapter, we discuss tests of significance, another type of statistical inference. The mathematics of probability, in particular the sampling distributions discussed in Chapter 18, provides the formal basis for a test of significance. The sampling distribution allows us to assess “probabilistically” the strength of evidence against a null hypothesis, through either a level of significance or a P-value. Hypothesis testing and confidence interval estimation have different goals: a test assesses the evidence that data provide about some claim concerning a population, whereas a confidence interval, discussed in Chapter 21, estimates a population parameter.
Although we have applied the reasoning of tests of significance to population proportions and population means, the same reasoning applies to tests of significance for other population parameters, such as the correlation coefficient, in more advanced settings. In the next chapter, we provide more discussion of the practical interpretation of statistical tests.
CASE STUDY EVALUATED
Look again at the Case Study at the beginning of this chapter. Could the 2014 and 2013 random samples in the Higher Education Research Institute’s surveys of college freshmen differ by 18.0% versus 20.1% for those reporting spending at least 16 hours per week socializing with friends, and by 38.8% versus 36.3% for those reporting dedicating five hours per week or less to socializing, just by chance? Tests of significance can help answer these questions. In both cases, one finds that the P-values for the tests of whether two such random samples would differ by the amounts reported were less than 0.001.
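A test of this kind can be sketched in Python. The sample sizes are not given here, so the code assumes a hypothetical 10,000 respondents in each year; that figure, the two-sided alternative, and the function names are assumptions made only for illustration, and the numerical results depend on them. The sketch uses a pooled two-proportion z test, one common way to compare two sample proportions, which may not be the exact procedure behind the reported P-values.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Cumulative probability of the standard Normal distribution at z."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def two_proportion_z_test(p1, n1, p2, n2):
    """Two-sided test of H0: the two population proportions are equal.

    p1, p2 are sample proportions; n1, n2 are sample sizes.
    Returns the standard score z and the two-sided P-value.
    """
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)           # common proportion under H0
    se = sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / se
    p_value = 2.0 * (1.0 - normal_cdf(abs(z)))         # both tails
    return z, p_value


# ASSUMPTION: hypothetical sample size per year; not stated in the text.
N_ASSUMED = 10_000

# At least 16 hours per week socializing: 18.0% (2014) versus 20.1% (2013).
z_16, p_16 = two_proportion_z_test(0.180, N_ASSUMED, 0.201, N_ASSUMED)

# Five hours per week or less socializing: 38.8% (2014) versus 36.3% (2013).
z_5, p_5 = two_proportion_z_test(0.388, N_ASSUMED, 0.363, N_ASSUMED)

print(f"16+ hours: z = {z_16:.2f}, P-value = {p_16:.5f}")
print(f"5 hours or less: z = {z_5:.2f}, P-value = {p_5:.5f}")
```

Larger samples make the same percentage differences harder to explain by chance, so the P-values shrink as the assumed sample size grows; with much smaller assumed samples they would be larger.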
1. Using language that can be understood by someone who knows no statistics, write a paragraph explaining what a P-value of less than 0.001 means in the context of the Higher Education Research Institute’s surveys of college freshmen.
2. Are the results of the study significant at the 0.05 level? At the 0.01 level? Explain.
Online Resources
• The Snapshots video Hypothesis Tests discusses the basic reasoning of tests of significance in the context of an example involving discrimination.
• The StatClips video P-value Interpretation discusses the interpretation of a P-