6.2 Estimating with Confidence

The SAT is a widely used measure of readiness for college study. It consists of three sections, one for mathematical reasoning ability (SATM), one for verbal reasoning ability (SATV), and one for writing ability (SATW). Possible scores on each section range from 200 to 800, for a total range of 600 to 2400. Since 1995, section scores have been recentered so that the mean is approximately 500 with a standard deviation of 100 in a large “standardized group.” This scale has been maintained so that scores have a constant interpretation.

EXAMPLE 6.9 Estimating the Mean SATM Score for Seniors in California

Suppose that you want to estimate the mean SATM score for the 486,549 high school seniors in California.7 You know better than to trust data from the students who choose to take the SAT. Only about 38% of California students typically take the SAT. These self-selected students are planning to attend college and are not representative of all California seniors. At considerable effort and expense, you give the test to a simple random sample (SRS) of 500 California high school seniors. The mean score for your sample is \(\bar{x} = 485\). What can you say about the mean score \(\mu\) in the population of all 486,549 seniors?

Reminder

law of large numbers, p. 222

The sample mean \(\bar{x}\) is the natural estimator of the unknown population mean \(\mu\). We know that \(\bar{x}\) is an unbiased estimator of \(\mu\). More important, the law of large numbers says that the sample mean must approach the population mean as the size of the sample grows. The value \(\bar{x} = 485\), therefore, appears to be a reasonable estimate of the mean score that all 486,549 students would achieve if they took the test. But how reliable is this estimate? A second sample of 500 students would surely not give a sample mean of 485 again. Unbiasedness says only that there is no systematic tendency to underestimate or overestimate the truth. Could we plausibly get a sample mean of 465 or 510 in repeated samples? An estimate without an indication of its variability is of little value.

Statistical confidence

Reminder

unbiased estimator, p. 279

The unbiasedness of an estimator concerns the center of its sampling distribution, but questions about variation are answered by looking at its spread. From the central limit theorem, we know that if the entire population of SATM scores has mean \(\mu\) and standard deviation \(\sigma\), then in repeated samples of size 500 the sample mean \(\bar{x}\) is approximately \(N(\mu, \sigma/\sqrt{500})\). Let us suppose that we know that the standard deviation of SATM scores in our California population is \(\sigma = 100\). (We see in the next chapter how to proceed when \(\sigma\) is not known. For now, we are more interested in statistical reasoning than in details of realistic methods.) This means that, in repeated sampling, the sample mean \(\bar{x}\) has an approximately Normal distribution centered at the unknown population mean \(\mu\) and a standard deviation of

\[ \sigma_{\bar{x}} = \frac{100}{\sqrt{500}} \approx 4.5 \]

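Now we are ready to proceed. Consider this line of thought, which is illustrated by Figure 6.7:

  1. The 68–95–99.7 rule says that the probability is about 0.95 that \(\bar{x}\) will be within 9 points (two standard deviations of \(\bar{x}\)) of the population mean score \(\mu\).
  2. To say that \(\bar{x}\) lies within 9 points of \(\mu\) is the same as saying that \(\mu\) is within 9 points of \(\bar{x}\).
  3. So about 95% of all samples will contain the true \(\mu\) in the interval from \(\bar{x} - 9\) to \(\bar{x} + 9\).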

We have simply restated a fact about the sampling distribution of \(\bar{x}\). The language of statistical inference uses this fact about what would happen in the long run to express our confidence in the results of any one sample. Our sample gave \(\bar{x} = 485\). We say that we are 95% confident that the unknown mean score for all California seniors lies between

\[ \bar{x} - 9 = 485 - 9 = 476 \]

and

\[ \bar{x} + 9 = 485 + 9 = 494 \]

Be sure you understand the grounds for our confidence. There are only two possibilities for our SRS:

  1. The interval between 476 and 494 contains the true \(\mu\).
  2. The interval between 476 and 494 does not contain the true \(\mu\).

FIGURE 6.7 Distribution of the sample mean, Example 6.9. \(\bar{x}\) lies within \(\pm 9\) points of \(\mu\) in 95% of all samples. This also means that \(\mu\) is within \(\pm 9\) points of \(\bar{x}\) in those samples.

We cannot know whether our sample is one of the 95% for which the interval contains \(\mu\) or one of the unlucky 5% for which it does not. The statement that we are 95% confident is shorthand for saying, “We arrived at these numbers by a method that gives correct results 95% of the time.”
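The arithmetic behind the interval in Example 6.9 can be checked with a few lines of Python. This is a minimal sketch, not part of the original example: it assumes a known population standard deviation of 100 (the value consistent with the interval from 476 to 494) and uses the multiplier 2 from the 68–95–99.7 rule. The variable names are ours.

```python
import math

n = 500       # size of the SRS
x_bar = 485   # observed sample mean SATM score
sigma = 100   # ASSUMPTION: known population standard deviation

# Standard deviation of the sample mean in repeated samples of size n
sd_x_bar = sigma / math.sqrt(n)

# 68-95-99.7 rule: about 95% of sample means fall within
# two standard deviations of the population mean
margin = 2 * sd_x_bar

lower = x_bar - margin
upper = x_bar + margin
print(f"standard deviation of x-bar = {sd_x_bar:.2f}")
print(f"95% interval: {lower:.0f} to {upper:.0f}")
```

The unrounded standard deviation is slightly below 4.5, so the unrounded margin is slightly below 9; after rounding to whole points, the endpoints agree with 476 and 494.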

Apply Your Knowledge

Question 6.23

6.23 Company invoices.

The mean amount \(\mu\) for all the invoices for your company last month is not known. Based on your past experience, you are willing to assume that the standard deviation of invoice amounts is about $260. If you take a random sample of 100 invoices, what is the value of the standard deviation for \(\bar{x}\)?

6.23

\(\sigma_{\bar{x}} = 260/\sqrt{100} = 26\), that is, $26.

Question 6.24

6.24 Use the 68–95–99.7 rule.

In the setting of the previous exercise, the 68–95–99.7 rule says that the probability is about 0.95 that \(\bar{x}\) is within ________ of the population mean \(\mu\). Fill in the blank.

Question 6.25

6.25 An interval for 95% of the sample means.

In the setting of the previous two exercises, about 95% of all samples will capture the true mean of all the invoices in the interval \(\bar{x}\) plus or minus ________. Fill in the blank.

6.25

$52.

Confidence intervals

In the setting of Example 6.9 (page 302), the interval of numbers between the values \(\bar{x} \pm 9\) is called a 95% confidence interval for \(\mu\). Like most confidence intervals we will discuss, this one has the form

\[ \text{estimate} \pm \text{margin of error} \]

The estimate (\(\bar{x}\) in this case) is our guess for the value of the unknown parameter. The margin of error (9 here) reflects how accurate we believe our guess is, based on the variability of the estimate, and how confident we are that the procedure will produce an interval that contains the true population mean \(\mu\).

Figure 6.8 illustrates the behavior of 95% confidence intervals in repeated sampling from a Normal distribution with mean \(\mu\). The center of each interval (marked by a dot) is at \(\bar{x}\) and varies from sample to sample. The sampling distribution of \(\bar{x}\) (also Normal) appears at the top of the figure to show the long-term pattern of this variation.

The 95% confidence intervals, \(\bar{x} \pm\) margin of error, from 25 SRSs appear below the sampling distribution. The arrows on either side of the dot (\(\bar{x}\)) span the confidence interval. All except one of the 25 intervals contain the true value of \(\mu\). In those intervals that contain \(\mu\), sometimes \(\mu\) is near the middle of the interval and sometimes it is closer to one of the ends. This again reflects the variation of \(\bar{x}\). In practice, we don’t know the value of \(\mu\), but we have a method such that, in a very large number of samples, 95% of the confidence intervals will contain \(\mu\).
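The long-run behavior described here can be imitated by simulation. The sketch below is only an illustration under stated assumptions: the population values (mean 500, standard deviation 100), the sample size of 20, the number of samples, and the random seed are all hypothetical choices of ours, and the helper name `coverage` is not from the text. It draws many SRSs from a Normal population, forms the interval \(\bar{x} \pm 1.960\,\sigma/\sqrt{n}\) for each, and reports the proportion of intervals that contain the true mean.

```python
import math
import random

def coverage(mu, sigma, n, z_star, num_samples, seed=0):
    """Proportion of intervals x_bar +/- z_star*sigma/sqrt(n) that contain mu."""
    rng = random.Random(seed)                  # fixed seed so runs are repeatable
    margin = z_star * sigma / math.sqrt(n)     # same margin for every sample (sigma known)
    hits = 0
    for _ in range(num_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / num_samples

# Hypothetical population and sample size, chosen only for illustration
rate = coverage(mu=500, sigma=100, n=20, z_star=1.960, num_samples=10_000)
print(f"Proportion of intervals containing mu: {rate:.3f}")
```

With only 25 samples, as in Figure 6.8, the observed proportion can differ noticeably from 95%; with many thousands of samples it should settle close to the confidence level.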

Statisticians have constructed confidence intervals for many different parameters based on a variety of designs for data collection. We meet a number of these in later chapters. Two important things about a confidence interval are common to all settings:

  1. It is an interval of the form \((a, b)\), where \(a\) and \(b\) are numbers computed from the sample data.
  2. It has a property called a confidence level that gives the probability of producing an interval that contains the unknown parameter.

Users can choose the confidence level, but 95% is the standard for most situations. Occasionally, 90% or 99% is used. We will use \(C\) to stand for the confidence level in decimal form. For example, a 95% confidence level corresponds to \(C = 0.95\).

FIGURE 6.8 Twenty-five samples from the same population gave these 95% confidence intervals. In the long run, 95% of all samples give an interval that covers \(\mu\).

Confidence Interval

A level \(C\) confidence interval for a parameter is an interval computed from sample data by a method that has probability \(C\) of producing an interval containing the true value of the parameter.

With the Confidence Interval applet, you can construct diagrams similar to the one displayed in Figure 6.8. The only difference is that the applet displays the Normal population distribution at the top along with the Normal sampling distribution of \(\bar{x}\). You choose the confidence level \(C\), the sample size \(n\), and whether you want to generate 1 or 25 samples at a time. A running total (and percent) of the number of intervals that contain \(\mu\) is displayed so you can consider a larger number of samples.

When generating single samples, the data for the latest SRS are shown below the confidence interval. The spread in these data reflects the spread of the population distribution. This spread is assumed known, and it does not change with sample size. What does change, as you vary \(n\), is the margin of error because it reflects the uncertainty in the estimate of \(\mu\). As you increase \(n\), you’ll find that the span of the confidence interval gets smaller and smaller.

Apply Your Knowledge

Question 6.26

6.26 Generating a single confidence interval.

Using the default settings in the Confidence Interval applet (95% confidence level and the default sample size \(n\)), click “Sample” to choose an SRS and display its confidence interval.

  1. Is the spread in the data, shown as yellow dots below the confidence interval, larger than the span of the confidence interval? Explain why this would typically be the case.
  2. For the same data set, you can compare the span of the confidence interval for different values of \(C\) by sliding the confidence level to a new value. For the SRS you generated in part 1, what happens to the span of the interval when you move \(C\) to 99%? What about 90%? Describe the relationship you find between the confidence level \(C\) and the span of the confidence interval.

Question 6.27

6.27 80% confidence intervals.

The idea of an 80% confidence interval is that the interval captures the true parameter value in 80% of all samples. That’s not high enough confidence for practical use, but 80% hits and 20% misses make it easy to see how a confidence interval behaves in repeated samples from the same population.

  1. Set the confidence level in the Confidence Interval applet to 80%. Click “Sample 25” to choose 25 SRSs and display their confidence intervals. How many of the 25 intervals contain the true mean \(\mu\)? What proportion contain the true mean?
  2. We can’t determine whether a new SRS will result in an interval that contains \(\mu\) or not. The confidence level only tells us what percent will contain \(\mu\) in the long run. Click “Sample 25” again to get the confidence intervals from 50 SRSs. What proportion hit? Keep clicking “Sample 25” and record the proportion of hits among 100, 200, 300, 400, and 500 SRSs. As the number of samples increases, we expect the percent of captures to get closer to the confidence level, 80%. Do you find this pattern in your results?

Confidence interval for a population mean

We will now construct a level \(C\) confidence interval for the mean \(\mu\) of a population when the data are an SRS of size \(n\). The construction is based on the sampling distribution of the sample mean \(\bar{x}\). This distribution is exactly \(N(\mu, \sigma/\sqrt{n})\) when the population has the \(N(\mu, \sigma)\) distribution. The central limit theorem says that this same sampling distribution is approximately correct for large samples whenever the population mean and standard deviation are \(\mu\) and \(\sigma\). For now, we will assume we are in one of these two situations. We discuss what we mean by “large sample” after we briefly study these intervals.

Our construction of a 95% confidence interval for the mean SATM score began by noting that any Normal distribution has probability about 0.95 within \(\pm 2\) standard deviations of its mean. To construct a level \(C\) confidence interval, we first catch the central \(C\) area under a Normal curve. That is, we must find the critical value \(z^*\) such that any Normal distribution has probability \(C\) within \(\pm z^*\) standard deviations of its mean.

Because all Normal distributions have the same standardized form, we can obtain everything we need from the standard Normal curve. Figure 6.9 shows how \(C\) and \(z^*\) are related. Values of \(z^*\) for many choices of \(C\) appear in the row labeled \(z^*\) at the bottom of Table D. Here are the most important entries from that row:

FIGURE 6.9 The area between the critical values \(-z^*\) and \(z^*\) under the standard Normal curve is \(C\).

\(z^*\)   1.645   1.960   2.576
\(C\)     90%     95%     99%

Notice that for 95% confidence, the value 2 obtained from the 68–95–99.7 rule is replaced with the more precise 1.96.
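If Table D is not at hand, the same critical values can be computed from the standard Normal distribution. The sketch below uses `NormalDist` from Python's standard `statistics` module; the helper name `critical_value` is ours. Because the central area \(C\) leaves area \((1 - C)/2\) in each tail, \(z^*\) is the point with cumulative area \((1 + C)/2\) to its left.

```python
from statistics import NormalDist

def critical_value(confidence):
    """Return z* with area `confidence` between -z* and z* under the standard Normal curve."""
    # Central area C leaves (1 - C)/2 in each tail, so z* is the (1 + C)/2 quantile.
    return NormalDist().inv_cdf((1 + confidence) / 2)

for c in (0.90, 0.95, 0.99):
    print(f"C = {c:.2f}   z* = {critical_value(c):.3f}")
```

Rounded to three decimal places, the computed values match the table entries 1.645, 1.960, and 2.576.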

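As Figure 6.9 reminds us, any Normal curve has probability \(C\) between the point \(z^*\) standard deviations below the mean and the point \(z^*\) standard deviations above the mean. The sample mean \(\bar{x}\) has the Normal distribution with mean \(\mu\) and standard deviation \(\sigma/\sqrt{n}\), so there is probability \(C\) that \(\bar{x}\) lies between

\[ \mu - z^* \frac{\sigma}{\sqrt{n}} \quad \text{and} \quad \mu + z^* \frac{\sigma}{\sqrt{n}} \]

This is exactly the same as saying that the unknown population mean \(\mu\) lies between

\[ \bar{x} - z^* \frac{\sigma}{\sqrt{n}} \quad \text{and} \quad \bar{x} + z^* \frac{\sigma}{\sqrt{n}} \]

That is, there is probability \(C\) that the interval \(\bar{x} \pm z^* \sigma/\sqrt{n}\) contains \(\mu\). This is our confidence interval. The estimate of the unknown \(\mu\) is \(\bar{x}\), and the margin of error is \(z^* \sigma/\sqrt{n}\).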

Confidence Interval for a Population Mean

Choose an SRS of size \(n\) from a population having unknown mean \(\mu\) and known standard deviation \(\sigma\). The margin of error for a level \(C\) confidence interval for \(\mu\) is

\[ m = z^* \frac{\sigma}{\sqrt{n}} \]

Here, \(z^*\) is the value on the standard Normal curve with area \(C\) between the critical points \(-z^*\) and \(z^*\). The level \(C\) confidence interval for \(\mu\) is

\[ \bar{x} \pm m \]

The confidence level of this interval is exactly \(C\) when the population distribution is Normal and is approximately \(C\) when \(n\) is large in other cases.

EXAMPLE 6.10 Average Credit Card Balance among College Students

Starting in 2008, Sallie Mae, a major provider of education loans and savings programs, has conducted an annual study titled “How America Pays for College.” Unlike other studies on college funding, this study assesses all aspects of spending and borrowing, for both educational and noneducational purposes. In the 2012 survey, 1601 randomly selected individuals (817 parents of undergraduate students and 784 undergraduate students) were surveyed by telephone.8

Many of the survey questions focused on the undergraduate student, so the parents in the survey were responding for their children. Do you think we should combine responses across these two groups? Do you think your parents are fully aware of your spending and borrowing habits? The authors reported overall averages and percents in their report but did break things down by group in their data tables. For now, we consider this a sample from one population, but we revisit this issue later.

One survey question asked about the undergraduate’s current total outstanding balance on credit cards. Of the 1601 who were surveyed, only \(n = 532\) provided an answer. Nonresponse should always be considered as a source of bias. In this case, the authors believed this nonresponse to be an ignorable source of bias and proceeded by treating the sample as if it were a random sample. We will do the same.

The average credit card balance was $755. The median balance was $196, so this distribution is clearly skewed. Nevertheless, because the sample size is quite large, we can rely on the central limit theorem to assure us that the confidence interval based on the Normal distribution will be a good approximation.

Let’s compute an approximate 95% confidence interval for the true mean credit card balance among all undergraduates. We assume that the standard deviation for the population of credit card debts is $1130. For 95% confidence, we see from Table D that \(z^* = 1.960\). The margin of error for the 95% confidence interval for \(\mu\) is, therefore,

\[ m = z^* \frac{\sigma}{\sqrt{n}} = \frac{(1.960)(1130)}{\sqrt{532}} = 96.02 \]

We have computed the margin of error with more digits than we really need. Our mean is rounded to the nearest $1, so we do the same for the margin of error. Keeping additional digits would provide no additional useful information. Therefore, we use \(m = 96\). The approximate 95% confidence interval is

\[ \bar{x} \pm m = 755 \pm 96 = (659, 851) \]

We are 95% confident that the average credit card debt among all undergraduates is between $659 and $851.
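The calculation in this example can be reproduced in Python. The following is a sketch, not the survey authors' own computation: the function name `z_interval` is ours, and it assumes, as the example does, that the population standard deviation is known to be $1130 and that the 532 responses can be treated as an SRS.

```python
import math
from statistics import NormalDist

def z_interval(x_bar, sigma, n, confidence=0.95):
    """Level-C confidence interval for a mean when sigma is known.

    Returns (lower, upper, margin_of_error).
    """
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)  # critical value z*
    margin = z_star * sigma / math.sqrt(n)               # m = z* sigma / sqrt(n)
    return x_bar - margin, x_bar + margin, margin

# Values from Example 6.10: sample mean $755, assumed sigma $1130, n = 532
lower, upper, margin = z_interval(x_bar=755, sigma=1130, n=532)
print(f"margin of error = {margin:.2f}")
print(f"95% confidence interval: ({lower:.0f}, {upper:.0f})")
```

The function computes \(z^*\) to more decimal places than Table D, which changes the margin of error only in digits that are dropped when rounding to the nearest dollar.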

Suppose that the researchers who designed this study had used a different sample size. How would this affect the confidence interval? We can answer this question by changing the sample size in our calculations and assuming that the sample mean is the same.

EXAMPLE 6.11 How Sample Size Affects the Confidence Interval

As in Example 6.10, the sample mean of the credit card debt is $755 and the population standard deviation is $1130. Suppose that the sample size is only 133 but still large enough for us to rely on the central limit theorem. In this case, the margin of error for 95% confidence is

\[ m = z^* \frac{\sigma}{\sqrt{n}} = \frac{(1.960)(1130)}{\sqrt{133}} = 192.05 \]

and the approximate 95% confidence interval is

\[ \bar{x} \pm m = 755 \pm 192 = (563, 947) \]

FIGURE 6.10 Confidence intervals for \(n = 532\) and \(n = 133\), Examples 6.10 and 6.11. A sample size four times as large results in a confidence interval that is half as wide.

Notice that the margin of error for this example is twice as large as the margin of error that we computed in Example 6.10. The only change that we made was to assume a sample size of 133 rather than 532. This sample size is one-fourth of the original 532. Thus, we double the margin of error when we reduce the sample size to one-fourth of the original value. Figure 6.10 illustrates the effect in terms of the intervals.
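The square-root relationship between sample size and margin of error is easy to check numerically. In this sketch, the helper `margin_of_error` is our own name, and the table value \(z^* = 1.960\) is used for 95% confidence; the standard deviation and sample sizes are those of Examples 6.10 and 6.11.

```python
import math

def margin_of_error(sigma, n, z_star=1.960):
    """Margin of error z* sigma / sqrt(n) for a mean with known sigma."""
    return z_star * sigma / math.sqrt(n)

sigma = 1130                       # assumed population standard deviation
m_532 = margin_of_error(sigma, 532)
m_133 = margin_of_error(sigma, 133)
ratio = m_133 / m_532              # equals sqrt(532 / 133) = sqrt(4)

print(f"n = 532: margin of error = {m_532:.2f}")
print(f"n = 133: margin of error = {m_133:.2f}")
print(f"ratio = {ratio:.2f}")
```

Because 532 is exactly four times 133, the ratio of the two margins is 2 no matter what values of \(\sigma\) and \(z^*\) are used.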

Apply Your Knowledge

Question 6.28

6.28 Average amount paid for college.

Refer to Example 6.10 (pages 307–308). The average annual amount the families paid for college was $20,902.9 If the population standard deviation is $7500, give the 95% confidence interval for \(\mu\), the average amount a family pays for a college undergraduate.

Question 6.29

6.29 Changing the sample size.

In the setting of the previous exercise, would the margin of error for 95% confidence be roughly doubled or halved if the sample size were raised to \(n = 6400\)? Verify your answer by performing the calculations.

6.29

Halved; the margin of error is 183.75.

Question 6.30

6.30 Changing the confidence level.

In the setting of Exercise 6.28, would the margin of error for 99% confidence be larger or smaller? Verify your answer by performing the calculations.

The argument leading to the form of confidence intervals for the population mean \(\mu\) rested on the fact that the statistic \(\bar{x}\) used to estimate \(\mu\) has a Normal distribution. Because many sample estimates have Normal distributions (at least approximately), it is useful to notice that the confidence interval has the form

\[ \text{estimate} \pm z^* \sigma_{\text{estimate}} \]

The estimate based on the sample is the center of the confidence interval. The margin of error is \(z^* \sigma_{\text{estimate}}\). The desired confidence level determines \(z^*\) from Table D. The standard deviation of the estimate is found from knowledge of the sampling distribution in a particular case. When the estimate is \(\bar{x}\) from an SRS, the standard deviation of the estimate is \(\sigma_{\text{estimate}} = \sigma/\sqrt{n}\). We return to this general form numerous times in the following chapters.

How confidence intervals behave

The margin of error for the mean of a Normal population illustrates several important properties that are shared by all confidence intervals in common use. The user chooses the confidence level, and the margin of error follows from this choice.

Both high confidence and a small margin of error are desirable characteristics of a confidence interval. High confidence says that our method almost always gives correct answers. A small margin of error says that we have pinned down the parameter quite precisely.

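Suppose that in planning a study you calculate the margin of error and decide that it is too large. Here are your choices to reduce it:

  1. Use a lower level of confidence (smaller \(C\)).
  2. Increase the sample size (larger \(n\)).
  3. Reduce \(\sigma\).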

For most problems, you would choose a confidence level of 90%, 95%, or 99%, so \(z^*\) will be 1.645, 1.960, or 2.576, respectively. Figure 6.9 (page 306) shows that \(z^*\) will be smaller for lower confidence (smaller \(C\)). The bottom row of Table D also shows this. If \(n\) and \(\sigma\) are unchanged, a smaller \(z^*\) leads to a smaller margin of error.

EXAMPLE 6.12 How the Confidence Level Affects the Confidence Interval

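Suppose that for the student credit card data in Example 6.10 (pages 307–308), we wanted 99% confidence. Table D tells us that for 99% confidence, \(z^* = 2.576\). The margin of error for 99% confidence based on 532 observations is

\[ m = z^* \frac{\sigma}{\sqrt{n}} = \frac{(2.576)(1130)}{\sqrt{532}} = 126.20 \]

and the 99% confidence interval is

\[ \bar{x} \pm m = 755 \pm 126 = (629, 881) \]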

Requiring 99%, rather than 95%, confidence has increased the margin of error from 96 to 126. Figure 6.11 compares the two intervals.

FIGURE 6.11 Confidence intervals, Examples 6.10 and 6.12. The larger the value of \(C\), the wider the interval.

Similarly, choosing a larger sample size \(n\) reduces the margin of error for any fixed confidence level. The square root in the formula implies that we must multiply the number of observations by 4 in order to cut the margin of error in half. If we want to reduce the margin of error by a factor of 4, we must take a sample 16 times as large. By rearranging the margin of error formula, we can solve for the \(n\) that will give a desired margin of error. Here is the result.

Sample Size for Specified Margin of Error

The confidence interval for a population mean will have a specified margin of error \(m\) when the sample size is

\[ n = \left( \frac{z^* \sigma}{m} \right)^2 \]

In the case where the underlying population is Normal, this formula provides the minimum necessary sample size to achieve a specified margin of error. However, for populations that are not Normal, beware that this formula might not result in a sample size that is large enough for \(\bar{x}\) to be sufficiently close to the Normal.
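Solving \(m = z^* \sigma/\sqrt{n}\) for \(n\) gives \(n = (z^* \sigma/m)^2\), and because a sample size must be a whole number, the result is rounded up. The sketch below illustrates the calculation; the function name `required_sample_size` is ours, and the target margin of $50 is a hypothetical value chosen only for illustration, paired with the assumed standard deviation of $1130 from the credit card examples.

```python
import math
from statistics import NormalDist

def required_sample_size(sigma, margin, confidence=0.95):
    """Smallest n for which z* sigma / sqrt(n) is at most `margin`."""
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    # Round up: rounding down would give a margin slightly larger than requested.
    return math.ceil((z_star * sigma / margin) ** 2)

# Hypothetical target: margin of error of $50 at 95% confidence, sigma = $1130
n_needed = required_sample_size(sigma=1130, margin=50)
print(f"required sample size: {n_needed}")
```

Rounding up rather than to the nearest integer guarantees that the achieved margin of error does not exceed the target.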

Finally, the margin of error is directly related to the size of the standard deviation \(\sigma\), the measure of population variation. You can think of the variation among individuals in the population as noise that obscures the average value \(\mu\). It is harder to pin down the mean of a highly variable population; that is why the margin of error of a confidence interval increases with \(\sigma\).

In practice, we can sometimes reduce \(\sigma\) by carefully controlling the measurement process. We also might change the mean of interest by restricting our attention to only part of a large population. Focusing on a subpopulation will often result in a smaller \(\sigma\).

Apply Your Knowledge

Question 6.31

6.31 Starting salaries.

You are planning a survey of starting salaries for recent business majors. In the latest survey by the National Association of Colleges and Employers, the average starting salary was reported to be $55,144.10 If you assume that the standard deviation is $11,000, what sample size do you need to have a margin of error equal to $1000 with 95% confidence?

6.31

\(n = (1.960 \times 11{,}000/1000)^2 \approx 464.8\), so use \(n = 465\).

Question 6.32

6.32 Changes in sample size.

Suppose that, in the setting of the previous exercise, you have the resources to contact 500 recent graduates. If all respond, will your margin of error be larger or smaller than $1000? What if only 50% respond? Verify your answers by performing the calculations.

Some cautions

We have already seen that small margins of error and high confidence can require large numbers of observations. You should also be keenly aware that any formula for inference is correct only in specific circumstances. If the government required statistical procedures to carry warning labels like those on drugs, most inference methods would have long labels. Our formula for estimating a population mean comes with the following list of warnings for the user:

Reminder

standard deviation s, p. 31

The most important caution concerning confidence intervals is a consequence of the first of these warnings. The margin of error in a confidence interval covers only random sampling errors. The margin of error is obtained from the sampling distribution and indicates how much error can be expected because of chance variation in randomized data production.

Practical difficulties such as undercoverage and nonresponse in a sample survey cause additional errors. These errors can be larger than the random sampling error. This often happens when the sample size is large (so that \(\sigma/\sqrt{n}\) is small). Remember this unpleasant fact when reading the results of an opinion poll or other sample survey. The practical conduct of the survey influences the trustworthiness of its results in ways that are not included in the announced margin of error.

Every inference procedure that we will meet has its own list of warnings. Because many of the warnings are similar to those we have mentioned, we do not print the full warning label each time. It is easy to state (from the mathematics of probability) conditions under which a method of inference is exactly correct. These conditions are never fully met in practice.

For example, no population is exactly Normal. Deciding when a statistical procedure should be used in practice often requires judgment assisted by exploratory analysis of the data. Mathematical facts are, therefore, only a part of statistics. The difference between statistics and mathematics can be stated thus: mathematical theorems are true; statistical methods are often effective when used with skill.

Finally, you should understand what statistical confidence does not say. Based on our SRS in Example 6.9 (page 302), we are 95% confident that the mean SATM score for the California students lies between 476 and 494. This says that this interval was calculated by a method that gives correct results in 95% of all possible samples. It does not say that the probability is 0.95 that the true mean falls between 476 and 494. No randomness remains after we draw a particular sample and compute the interval. The true mean either is or is not between 476 and 494. The probability calculations of standard statistical inference describe how often the method, not a particular sample, gives correct answers.

Apply Your Knowledge

Question 6.33

6.33 Nonresponse in a survey.

Let’s revisit Example 6.10 (pages 307–308). Of the 1601 participants in the survey, only 532 reported the undergraduate’s outstanding credit card balance. For that example, we proceeded as if we had a random sample and calculated a margin of error at 95% confidence of $96. Provide a couple of reasons a survey respondent might not provide an estimate. Based on these reasons, do you think that this margin of error of $96 is a good measure of the accuracy of the survey’s results? Explain your answer.

6.33

Answers will vary. We may not get a response for a variety of reasons. Regardless, it is likely the 532 who responded are different than those who didn’t respond so that our estimated margin of error is not a good measure of accuracy.