8.2 $t$ Interval for the Population Mean

OBJECTIVES By the end of this section, I will be able to …

  1. Describe the characteristics of the $t$ distribution.
  2. Calculate and interpret a $t$ interval for the population mean.

1 Introducing the $t$ Distribution

In Section 8.1, we constructed confidence intervals for the population mean $\mu$ assuming that the population standard deviation $\sigma$ was known. This assumption may be valid for certain fields such as quality control. However, in many real-world problems, we do not know the value of $\sigma$, and thus cannot use a $Z$ interval to estimate the mean. When $\sigma$ is unknown, we use the sample standard deviation $s$ to construct a confidence interval that is likely to contain the population mean.

The $t$ distribution was discovered in 1908 by William Sealy Gosset, while he was working as an analyst on the selection of the best-yielding varieties of barley at the Guinness Brewery in Dublin, Ireland. Guinness required that his research be anonymous, so Gosset published his work under the pen name Student, which is why the distribution is often called Student's $t$ distribution. It should of course be called Gosset's $t$ distribution.

Fact 4 from Chapter 7 showed us that we could standardize $\bar{x}$ to derive the standard normal random variable:

$$Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$

Unfortunately, however, if we replace the unknown $\sigma$ in this equation with the known $s$, we can no longer obtain the standard normal $Z$ because $s$, being a statistic, is itself a random variable. Instead, the quantity

$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$

follows an entirely new and different distribution, called the $t$ distribution.

$t$ Distribution

For a normal population, the distribution of

$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$

follows a $t$ distribution, with $n - 1$ degrees of freedom, where $\bar{x}$ is the sample mean, $\mu$ is the unknown population mean, $s$ is the sample standard deviation, and $n$ is the sample size.

Developing Your Statistical Sense

Degrees of Freedom

Notice that the definition of the $t$ distribution includes a new concept called degrees of freedom. Degrees of freedom is a measure that determines how the $t$ distribution changes as the sample size changes. The idea of degrees of freedom is that, in a sum of $n$ numbers, you need to know only the first $n - 1$ of these numbers to find the $n$th number because you already know the sum. For example, suppose you know that the sum of $n = 3$ numbers is 10 and are told that the first two numbers are 5 and 1. Then you can deduce that the last number is $10 - (5 + 1) = 4$. The first two numbers have the freedom to take on any values, but the third number must take a particular value. Thus, there are only $n - 1 = 2$ independent pieces of information. The concept is similar for the $t$ distribution. Because we use the sample standard deviation $s$ to estimate the unknown $\sigma$ and because $\bar{x}$ is known, only $n - 1$ independent pieces of information are needed to find the value of $s$. Thus, we say that $t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$ follows a $t$ distribution with $n - 1$ degrees of freedom.
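The arithmetic in the paragraph above can be sketched in a few lines of Python. This is only an illustration of the "last number is forced" idea, using the numbers from the text; the variable names are ours.

```python
# Degrees of freedom: once the total is fixed, the last value is not free.
total = 10             # known sum of the n = 3 numbers
free_values = [5, 1]   # the first n - 1 numbers could have been anything

# The remaining number is forced by the known total.
last_value = total - sum(free_values)

n = len(free_values) + 1
degrees_of_freedom = n - 1

print(last_value, degrees_of_freedom)  # prints: 4 2
```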

Figure 14 displays a comparison of some $t$ curves with the $Z$ curve. Note that there is only one $Z$ distribution (or curve), but there is a different $t$ curve for each number of degrees of freedom (df), that is, for every different sample size. The degrees of freedom, $df = n - 1$, determines the shape of the $t$ distribution, just as the mean and variance uniquely determine the shape of the normal distribution. All $t$ curves have several characteristics in common.

FIGURE 14 Different $t$ curves for different degrees of freedom (df).

Characteristics of the $t$ Distribution

  • Centered at zero. The mean of $t$ is zero, just as with $Z$.
  • Symmetric about its mean zero, just as with $Z$.
  • As the degrees of freedom decreases, the $t$ curve gets flatter, and the area under the $t$ curve decreases in the center and increases in the tails. That is, the $t$ curve has heavier tails than the $Z$ curve.
  • As the degrees of freedom increases toward infinity, the $t$ curve approaches the $Z$ curve, and the area under the $t$ curve increases in the center and decreases in the tails.
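These characteristics can be checked numerically. The sketch below evaluates the standard density formula for the $t$ distribution at the center (0) and in the tail (3) for several degrees of freedom and compares the results with the standard normal density. The function names `t_pdf` and `z_pdf` and the chosen df values are ours, for illustration only.

```python
import math

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom at x."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def z_pdf(x):
    """Density of the standard normal distribution at x."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

# Height at the center (x = 0) and in the tail (x = 3) for several df.
for df in (1, 5, 30, 100):
    print(f"df={df:>3}  center={t_pdf(0, df):.4f}  tail={t_pdf(3, df):.4f}")
print(f"Z       center={z_pdf(0):.4f}  tail={z_pdf(3):.4f}")
```

As df grows, the center height rises toward the $Z$ value and the tail height falls toward it, which matches the last two bullets.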

Similar to the definition of $Z_{\alpha/2}$ in Section 8.1, we can define $t_{\alpha/2}$ to be the value of the $t$ distribution with area $\alpha/2$ to the right of it, as seen in Figure 15. Table 1 in Section 8.1 provides the $Z_{\alpha/2}$ values for certain common confidence levels. Unfortunately, because there is a different $t$ curve for each sample size, there are many possible $t_{\alpha/2}$ values. You will need to use the $t$ table (Table D in the Appendix) to find the value of $t_{\alpha/2}$, as follows.

FIGURE 15 $t_{\alpha/2}$ has area $\alpha/2$ to the right of it.

Procedure for Finding $t_{\alpha/2}$

  • Step 1 Go across the row marked “Confidence level” in the $t$ table (Table D in the Appendix) until you find the column with the desired confidence level at the top. The $t_{\alpha/2}$ value is in this column somewhere.
  • Step 2 Go down the column until you see the correct number of degrees of freedom on the left. The number in that row and column is the desired value of $t_{\alpha/2}$.

EXAMPLE 12 Finding $t_{\alpha/2}$

Find the value of $t_{\alpha/2}$ that will produce a 95% confidence interval for $\mu$ if the sample size is $n = 20$.

Solution

  • Step 1 We go across the row labeled “Confidence level” in the $t$ table (Figure 16) until we see the 95% confidence level. Our $t_{\alpha/2}$ is somewhere in this column.
  • Step 2 The degrees of freedom are $df = n - 1 = 20 - 1 = 19$. We go down the column until we see 19 on the left. The number in that row is our $t_{\alpha/2}$, 2.093.

Note: For the newer TI-84s

  1. Press 2nd DISTR and select 4:invT.
  2. Enter the area to the left of the $t$ value, then a comma, then the degrees of freedom.
  3. Press ENTER.

For example, invT(0.975,19) gives 2.093024022. The TI-83 does not have this function.

FIGURE 16 Use the confidence level and the degrees of freedom to find $t_{\alpha/2}$.
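For readers without a $t$ table or a calculator with invT, the lookup can be approximated numerically. The sketch below is one possible approach using only the Python standard library, not the method used by the table or the calculator. It integrates the $t$ density with Simpson's rule and then searches by bisection for the value with area $1 - \alpha/2$ to its left. The function names and the step count are our own choices.

```python
import math

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom at x."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """Area to the left of x (x >= 0): 0.5 plus Simpson's rule on [0, x]."""
    if x == 0:
        return 0.5
    h = x / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(confidence, df):
    """Approximate t_{alpha/2} for the given confidence level and df."""
    target = 1 - (1 - confidence) / 2  # area to the left of t_{alpha/2}
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < target:      # widen the bracket until it contains the answer
        lo, hi = hi, hi * 2
    for _ in range(60):                # bisection
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(round(t_critical(0.95, 19), 3))  # prints: 2.093
```

The target area 0.975 is the same "area to the left" that is entered into invT in the calculator note.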

NOW YOU CAN DO

Exercises 5–8.

YOUR TURN #7

Find the value of $t_{\alpha/2}$ that will produce a 90% confidence interval for $\mu$ if the sample size is .

(The solution is shown in Appendix A.)

2 $t$ Interval for the Population Mean

The $t$ distribution provides the following confidence interval for the unknown population mean $\mu$, called the $t$ interval.

Note: Suppose that $\sigma$ is unknown, the population is either non-normal or of unknown distribution, and the sample size is not large. Then we should not use the $t$ interval. Instead, we need to turn to nonparametric methods, for example, the sign interval or the Wilcoxon interval. (See Chapter 14: Nonparametric Statistics, available online.)

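$t$ Interval for $\mu$

The $t$ interval for $\mu$ may be constructed whenever either of the following conditions is met:

  • The population is normal.
  • The sample size is large ($n \geq 30$).

Suppose a random sample of size $n$ is taken from a population with unknown mean $\mu$ and unknown standard deviation $\sigma$. A $100(1 - \alpha)\%$ confidence interval for $\mu$ is given by the interval

$$\text{lower bound} = \bar{x} - t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right), \qquad \text{upper bound} = \bar{x} + t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right)$$

where $\bar{x}$ is the sample mean, $t_{\alpha/2}$ is associated with the confidence level and $n - 1$ degrees of freedom, and $s$ is the sample standard deviation. The $t$ interval may also be written as

$$\bar{x} \pm t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right)$$

and is denoted

$$(\text{lower bound}, \text{upper bound})$$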
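As a minimal sketch, the $t$ interval can be computed from summary statistics as follows. The function name `t_interval` and the input values are hypothetical and chosen only for illustration. The critical value 2.064 is the usual table value for 95% confidence with 24 degrees of freedom, and it is passed in rather than computed.

```python
import math

def t_interval(xbar, s, n, t_crit):
    """Return (lower bound, upper bound) of the t interval for the mean."""
    margin = t_crit * s / math.sqrt(n)
    return xbar - margin, xbar + margin

# Hypothetical summary statistics: n = 25, so df = 24.
lower, upper = t_interval(xbar=50.0, s=8.0, n=25, t_crit=2.064)
print(round(lower, 2), round(upper, 2))  # prints: 46.7 53.3
```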

EXAMPLE 13 Checking whether the conditions are met for the $t$ interval for $\mu$

Never assume normality unless it is indicated or evidence for it exists.

For each of the following, we are taking a random sample from a population with $\sigma$ unknown. Determine whether the conditions are met for constructing the indicated $t$ interval for $\mu$. If not, explain why not.

  1. Confidence level 99%, , ,
  2. Confidence level 95%, , , , normal population

Solution

  1. The sample size is not large ($n < 30$), and we are not told that the population is normal. Therefore, the conditions are not met for the $t$ interval for $\mu$. It is not okay to construct the interval.
  2. Again the sample size is not large, but this time we are told that the population is normal. Thus, the conditions are met for the $t$ interval for $\mu$. It is okay to construct the interval.
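The check used in this example can be written as a short helper. This is only a sketch: the function name is ours, the threshold of 30 is the usual large-sample cutoff, and the sample sizes in the calls are hypothetical rather than the ones from the example.

```python
def t_interval_conditions_met(n, population_normal):
    """True if the population is normal or the sample size is large (n >= 30)."""
    return population_normal or n >= 30

# Hypothetical cases, mirroring the reasoning in the solution above.
print(t_interval_conditions_met(n=12, population_normal=False))  # prints: False
print(t_interval_conditions_met(n=12, population_normal=True))   # prints: True
print(t_interval_conditions_met(n=45, population_normal=False))  # prints: True
```

In practice, whether the population is normal is a judgment based on stated information or on a normal probability plot, not a value known in advance.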

NOW YOU CAN DO

Exercises 9–12.

YOUR TURN #8

For each of the following, we are taking a random sample from a population with $\sigma$ unknown. Determine whether the conditions are met for constructing the indicated $t$ interval for $\mu$. If not, explain why not.

  1. Confidence level 95%, , ,
  2. Confidence level 95%, , ,

(The solutions are shown in Appendix A.)

EXAMPLE 14 Constructing a $t$ confidence interval for $\mu$

cerealsodium

Research has shown that the amount of sodium consumed in food has been associated with hypertension (high blood pressure). The table provides a list of 16 breakfast cereals, along with their sodium contents, in milligrams per serving.

  1. Determine whether the conditions are met for constructing a $t$ interval for the population mean sodium content per serving for all breakfast cereals.
  2. Find the value of $t_{\alpha/2}$ for 99% confidence and degrees of freedom $df = n - 1 = 16 - 1 = 15$.
  3. Construct a 99% confidence interval for the population mean sodium content.
  4. Interpret the meaning of this confidence interval.

Cereal                   Sodium (mg)    Cereal               Sodium (mg)
Apple Jacks              125            Grape Nuts Flakes    140
Cap'n Crunch             220            Kix                  260
Cinnamon Toast Crunch    210            Life                 150
Corn Flakes              290            Lucky Charms         180
Count Chocula            180            Raisin Bran          210
Cream of Wheat           80             Rice Chex            240
Fruit Loops              125            Special K            230
Fruity Pebbles           135            Total Whole Grain    200

Solution

  1. Figure 17 contains the normal probability plot for the data set. Though not perfect, all points lie within the bounds, indicating acceptable normality. Thus, we proceed to construct the 99% confidence interval.
    FIGURE 17 Normal probability plot for sodium in cereal.
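  2. The value of $t_{\alpha/2}$ for 99% confidence and 15 degrees of freedom is 2.947.
  3. A 99% confidence interval for $\mu$ is given by the interval

    $$\bar{x} \pm t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right)$$

    From the Minitab output in Figure 18, we have $n = 16$, $\bar{x} = 185.9$, and $s = 56.8$. Substituting, we get:

    $$185.9 \pm 2.947\left(\frac{56.8}{\sqrt{16}}\right) \approx 185.9 \pm 41.8 = (144.1, 227.7)$$

    FIGURE 18 Minitab output.
  4. We are 99% confident that $\mu$, the population mean sodium content per serving of all breakfast cereals, lies between 144.1 milligrams and 227.7 milligrams.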
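The calculation in this example can be reproduced directly from the 16 sodium values in the table. The sketch below uses Python's standard `statistics` module and takes the critical value 2.947 from part 2 rather than computing it. The variable names are ours.

```python
import math
import statistics

# Sodium content (mg per serving) for the 16 cereals in the table.
sodium = [125, 220, 210, 290, 180, 80, 125, 135,
          140, 260, 150, 180, 210, 240, 230, 200]

n = len(sodium)
xbar = statistics.mean(sodium)
s = statistics.stdev(sodium)   # sample standard deviation (divides by n - 1)
t_crit = 2.947                 # from the example: 99% confidence, df = 15

margin = t_crit * s / math.sqrt(n)
lower, upper = xbar - margin, xbar + margin

print(n, xbar, round(s, 2))
print(round(lower, 2), round(upper, 2))
```

Because this sketch carries full precision through the arithmetic, its bounds (about 144.08 and 227.79) can differ in the last reported digit from the interval (144.1, 227.7), which is built from rounded intermediate values.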

NOW YOU CAN DO

Exercises 13–28.

YOUR TURN #9

Find and interpret a 95% confidence interval for $\mu$, which is the population mean sodium content per serving of all breakfast cereals.

(The solution is shown in Appendix A.)

Developing Your Statistical Sense

$t$ Intervals May Offer More Peace of Mind Than $Z$ Intervals

In Example 14, if we had assumed that the population standard deviation was known ($\sigma \approx 56.8$), then the 99% $Z$ interval for the population mean amount of sodium would have been

$$\bar{x} \pm Z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right) = 185.9 \pm 2.576\left(\frac{56.8}{\sqrt{16}}\right) \approx 185.9 \pm 36.6 = (149.3, 222.5)$$

Note that this $Z$ interval (149.3, 222.5) is only slightly more precise than the $t$ interval (144.1, 227.7). However, the $Z$ interval depends on prior knowledge of the value of $\sigma$. If the value of $\sigma$ is inaccurate, then the $Z$ interval will be misleading and overly optimistic. With even moderate sample sizes, reporting the $t$ interval instead of the $Z$ interval may offer peace of mind to the data analyst.

If the degrees of freedom needed to find $t_{\alpha/2}$ do not appear in the df column of the $t$ table, a conservative solution is to take the next row with smaller df in the $t$ table. For example, suppose a data set requires a df that lies between 40 and 50 and is not in the table. Instead, we use $df = 40$. (Even though 50 may be closer, it will lead to an interval that overstates the precision in the data.) For df beyond the last row of the table, use the associated $Z$ critical values, because the $t$ distribution approaches the $Z$ distribution as $n$ gets very large.
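The conservative rule of dropping to the next smaller tabled df can be sketched as a lookup. The list of df rows below is hypothetical, since the rows of Table D are not reproduced here, and the function name is ours.

```python
import bisect

# Hypothetical df rows; an actual t table may list different values.
TABLE_DF = list(range(1, 31)) + [40, 50, 60, 70, 80, 90, 100, 1000]

def conservative_df(df, table_rows=TABLE_DF):
    """Largest tabled df that does not exceed the required df."""
    if df < table_rows[0]:
        raise ValueError("df is smaller than the first row of the table")
    i = bisect.bisect_right(table_rows, df)  # index of first row greater than df
    return table_rows[i - 1]

print(conservative_df(46))  # prints: 40
print(conservative_df(19))  # prints: 19
```

Rounding df down gives a slightly larger critical value and therefore a slightly wider interval, which is why the choice is called conservative.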

Margin of Error

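Recall that the margin of error $E$ for the $Z$ interval equals $Z_{\alpha/2}(\sigma/\sqrt{n})$. For the $t$ interval, because $\sigma$ is unknown, the margin of error is given as follows.

Margin of Error for the $t$ Interval

$$E = t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right)$$

The margin of error $E$ for a $100(1 - \alpha)\%$ $t$ interval for $\mu$ can be interpreted as follows: “We can estimate $\mu$ to within $E$ units with $100(1 - \alpha)\%$ confidence.”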

EXAMPLE 15 Margin of error

Use the statistics observed in Example 14.

  1. Find the margin of error for the 99% confidence interval for mean sodium content per serving of all breakfast cereals.
  2. Interpret the margin of error.

Solution

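  1. From Example 14c, we have:

    $$E = t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right) = 2.947\left(\frac{56.8}{\sqrt{16}}\right) \approx 41.8$$

    The margin of error for mean sodium content is 41.8 milligrams.

  2. We can estimate the population mean sodium content per serving of all breakfast cereals to within 41.8 milligrams with 99% confidence.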
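As a quick check, the margin of error should equal half the width of the confidence interval. The sketch below assumes $s \approx 56.8$, a value computed from the cereal data in Example 14, together with the critical value 2.947 and the interval (144.1, 227.7) from that example. The function name is ours.

```python
import math

def margin_of_error(t_crit, s, n):
    """Margin of error E = t_crit * s / sqrt(n) for a t interval."""
    return t_crit * s / math.sqrt(n)

E = margin_of_error(t_crit=2.947, s=56.8, n=16)  # s is an assumed rounded value
half_width = (227.7 - 144.1) / 2                 # from the interval in Example 14

print(round(E, 1), round(half_width, 1))  # prints: 41.8 41.8
```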

NOW YOU CAN DO

Exercises 29–40.

YOUR TURN #10

Find and interpret the margin of error for the 95% confidence interval for mean sodium content found in Your Turn #9 after Example 14.

(The solution is shown in Appendix A.)

What Does the Margin of Error Mean?

The margin of error $E$ provides an indication of the accuracy of the confidence interval estimate for confidence level $100(1 - \alpha)\% = 99\%$. That is, if we repeatedly take many samples of 16 breakfast cereals, our sample mean $\bar{x}$ will be within $E = 41.8$ milligrams of the unknown population mean $\mu$ in 99% of those samples.
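The repeated-sampling interpretation can be illustrated with a small simulation. This is a sketch under stated assumptions: the population is taken to be normal with a hypothetical mean of 186 and standard deviation of 57 (values we chose to resemble the cereal data), and each sample's margin of error is computed from that sample's own standard deviation, so $E$ varies a little from sample to sample.

```python
import math
import random
import statistics

random.seed(1)              # fixed seed so the run is repeatable

MU, SIGMA = 186.0, 57.0     # hypothetical normal population
N, T_CRIT = 16, 2.947       # sample size and t value for 99%, df = 15
TRIALS = 5000

hits = 0
for _ in range(TRIALS):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    xbar = statistics.mean(sample)
    margin = T_CRIT * statistics.stdev(sample) / math.sqrt(N)
    if abs(xbar - MU) <= margin:   # sample mean within E of the true mean
        hits += 1

coverage = hits / TRIALS
print(coverage)
```

The printed proportion should land close to 0.99, which is what the 99% confidence level promises over many repeated samples.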

EXAMPLE 16 $t$ intervals for $\mu$ using technology

cerealsodium

For the breakfast cereal data in Example 14, construct a 95% confidence interval for the population mean sodium content, using the TI-83/84, Minitab, and SPSS.

Solution

We use the instructions provided in the Step-by-Step Technology Guide below. The sample size is not large ($n = 16 < 30$), so it is necessary to check for normality. Figure 17 indicates acceptable normality.

The results for the TI-83/84 in Figure 19 display the 95% confidence interval for the population mean sodium content to be

$$(\text{lower bound}, \text{upper bound}) = (155.67, 216.21)$$

FIGURE 19 TI-83/84 results.

They also show the sample mean $\bar{x} = 185.9375$, the sample standard deviation $s \approx 56.81$, and the sample size $n = 16$.

The Minitab results are shown in Figure 20, providing the sample size $n = 16$, the sample mean $\bar{x} = 185.9$, the sample standard deviation $s = 56.8$, the standard error (SE mean) 14.2, and the 95% confidence interval (155.7, 216.2).

FIGURE 20 Minitab results.

The SPSS results are shown in Figure 21, providing the sample mean $\bar{x} = 185.9375$, the standard error 14.20254, and the 95% confidence interval (155.6655, 216.2095).

FIGURE 21 SPSS results.
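The same summary can be produced in Python as a rough cross-check of the software output. This sketch is not part of the Step-by-Step Technology Guide. It uses only the standard library and assumes the table value $t_{\alpha/2} = 2.131$ for 95% confidence with 15 degrees of freedom, so its bounds agree with the software results only to about two decimal places.

```python
import math
import statistics

# Sodium content (mg per serving) for the 16 cereals in Example 14.
sodium = [125, 220, 210, 290, 180, 80, 125, 135,
          140, 260, 150, 180, 210, 240, 230, 200]

n = len(sodium)
xbar = statistics.mean(sodium)
s = statistics.stdev(sodium)
se = s / math.sqrt(n)          # standard error of the mean
t_crit = 2.131                 # assumed table value: 95% confidence, df = 15

lower = xbar - t_crit * se
upper = xbar + t_crit * se

print(f"N={n}  Mean={xbar:.1f}  StDev={s:.1f}  SE Mean={se:.1f}  "
      f"95% CI=({lower:.1f}, {upper:.1f})")
```

Rounded to one decimal place, the interval matches the Minitab result (155.7, 216.2).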