Using inference wisely

In previous chapters, we have met the two major types of statistical inference: confidence intervals and significance tests. We have, however, seen only two inference methods of each type, one designed for inference about a population proportion p and the other designed for inference about a population mean μ. There are libraries of both books and software filled with methods for inference about various parameters in various settings. The reasoning of confidence intervals and significance tests remains the same, regardless of the method. The first step in using inference wisely is to understand your data and the questions you want to answer and fit the method to its setting. Here are some tips on inference, adapted to the settings we are familiar with.

The design of the data production matters. “Where do the data come from?” remains the first question to ask in any statistical study. Any inference method is intended for use in a specific setting. For our confidence interval and test for a proportion p:

EXAMPLE 1 The psychologist and the women’s studies professor

A psychologist is interested in how our visual perception can be fooled by optical illusions. Her subjects are students in Psychology 101 at her university. Most psychologists would agree that it’s safe to treat the students as an SRS of all people with normal vision. There is nothing special about being a student that changes visual perception.

A professor at the same university uses students in Women’s Studies 101 to examine attitudes toward violence against women and reproductive rights. Students as a group are younger than the adult population as a whole. Even among young people, students as a group come from more prosperous and better-educated homes. Even among students, this university isn’t typical of all campuses. Even on this campus, students in a women’s studies course may have opinions that are quite different from those of students who do not take Women’s Studies 101. The professor can’t reasonably act as if these students are a random sample from any population of interest other than students taking Women’s Studies 101 at this university during this term.

Know how confidence intervals behave. A confidence interval estimates the unknown value of a parameter and also tells us how uncertain the estimate is. All confidence intervals share these behaviors:

Dropping out. An experiment found that weight loss is significantly more effective than exercise for reducing high cholesterol and high blood pressure. The 170 subjects were randomly assigned to a weight-loss program, an exercise program, or a control group. Only 111 of the 170 subjects completed their assigned treatment, and the analysis used data from these 111. Did the dropouts create bias? Always ask about details of the data before trusting inference.

Know what statistical significance says. Many statistical studies hope to show that some claim is true. A clinical trial compares a new drug with a standard drug because the doctors hope that the health of patients given the new drug will improve. A psychologist studying gender differences suspects that women will do better than men (on the average) on a test that measures social-networking skills. The purpose of significance tests is to weigh the evidence that the data give in favor of such claims. That is, a test helps us know if we found what we were looking for.

To do this, we ask what would happen if the claim were not true. That’s the null hypothesis: no difference between the two drugs, no difference between women and men. A significance test answers only one question: “How strong is the evidence that the null hypothesis is not true?” A test answers this question by giving a P-value. The P-value tells us how likely data as or more extreme than ours would be if the null hypothesis were true. Data that would be very unlikely if the null hypothesis were true, and so have a small P-value, are good evidence that the null hypothesis is not true. We usually don’t know whether the hypothesis is true for this specific population. All we can say, when the P-value is 0.05 for example, is that “data as or more extreme than these would occur only 5% of the time if the hypothesis were true.”

This kind of indirect evidence against the null hypothesis (and for the effect we hope to find) is less straightforward than a confidence interval. We will say more about tests in the next section.
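The P-value reasoning described above can be made concrete with a small sketch. It assumes the usual large-sample z test for a population proportion p, which this section does not spell out, and it uses made-up numbers (60 successes in a sample of 100, null value 0.5) purely for illustration. The function names `normal_cdf` and `proportion_z_test` are hypothetical helpers, not part of any library.

```python
import math


def normal_cdf(z):
    """Standard Normal cumulative probability, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def proportion_z_test(successes, n, p0, alternative="two-sided"):
    """Large-sample z test for a proportion (an assumed, textbook-style method).

    Returns the z statistic and the P-value: the probability, computed as if
    the null hypothesis p = p0 were true, of data as or more extreme than ours.
    """
    p_hat = successes / n
    # The standard error uses p0 because we compute "as if the null were true".
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error
    if alternative == "greater":
        p_value = 1 - normal_cdf(z)
    elif alternative == "less":
        p_value = normal_cdf(z)
    else:
        p_value = 2 * (1 - normal_cdf(abs(z)))
    return z, p_value


# Hypothetical data: 60 successes in 100 trials, null hypothesis p = 0.5.
z, p_value = proportion_z_test(successes=60, n=100, p0=0.5)
print(f"z = {z:.2f}, P-value = {p_value:.4f}")
```

A small P-value here says only that data this extreme would be rare if p really were 0.5. It does not say how likely the null hypothesis itself is, and it says nothing about whether the data came from a sound design.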

Know what your methods require. Our significance test and confidence interval for a population proportion p require that the population size be much larger than the sample size. They also require that the sample size itself be reasonably large so that the sampling distribution of the sample proportion is close to Normal. We have said little about the specifics of these requirements because the reasoning of inference is more important. Just as there are inference methods that fit stratified samples, there are methods that fit small samples and small populations. If you plan to use statistical inference in practice, you will need help from a statistician (or need to learn lots more statistics) to manage the details.
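As a rough illustration of checking these two requirements before running a test or computing an interval for p, here is a minimal sketch. The text gives no numeric cutoffs, so the thresholds below (population at least 10 times the sample, and at least 10 expected successes and 10 expected failures) are assumptions based on common rules of thumb; other sources use different values. The function name `check_proportion_conditions` is hypothetical.

```python
def check_proportion_conditions(n, population_size, p,
                                min_population_ratio=10, min_count=10):
    """Check two requirements for large-sample inference about a proportion.

    The default thresholds are assumed rules of thumb, not values given in
    the text: adjust them to match whatever reference you are following.
    """
    expected_successes = n * p
    expected_failures = n * (1 - p)
    return {
        # Population should be much larger than the sample.
        "population_much_larger": population_size >= min_population_ratio * n,
        # Sample should be large enough for the sampling distribution of the
        # sample proportion to be close to Normal.
        "sample_large_enough": (expected_successes >= min_count
                                and expected_failures >= min_count),
    }


# Hypothetical settings: a large population with a moderate sample,
# then a small population with a small sample and a rare outcome.
print(check_proportion_conditions(n=100, population_size=50_000, p=0.5))
print(check_proportion_conditions(n=20, population_size=150, p=0.1))
```

Passing both checks does not make the inference trustworthy on its own. The design of the data production still comes first.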

Most of us read about statistical studies more often than we actually work with data ourselves. Concentrate on the big issues, not on the details of whether the authors used exactly the right inference methods. Does the study ask the right questions? Where did the data come from? Do the results make sense? Does the study report confidence intervals so you can see both the estimated values of important parameters and how uncertain the estimates are? Does it report P-values to help convince you that findings are not just good luck?