8.4 Confidence Intervals for the Population Variance and Standard Deviation

OBJECTIVES By the end of this section, I will be able to …

  1. Describe the properties of the $\chi^2$ (chi-square) distribution, and find critical values for the $\chi^2$ distribution.
  2. Construct and interpret confidence intervals for the population variance and standard deviation.

We have seen how confidence intervals can be used to estimate the unknown value of a population mean or a population proportion. However, the variability of a population is also important. As we have learned, less variability is usually better. For example, a tool manufacturer relies on a quality control technician (who has a strong background in statistics) to make sure that the tools the company is making do not vary appreciably from the required specifications. Otherwise, the tools may be too large or too small. Data analysts therefore construct confidence intervals to estimate the unknown value of the population parameters that measure variability: the population variance $\sigma^2$ and the population standard deviation $\sigma$.

We first need to become acquainted with the $\chi^2$ (chi-square) distribution, which is used to construct these confidence intervals.

1 Properties of the $\chi^2$ (Chi-Square) Distribution

The $\chi^2$ (pronounced ky-square, to rhyme with “my square”) distribution was discovered in 1875 by the German physicist Friedrich Helmert and further developed in 1900 by the English statistician Karl Pearson. It is a continuous distribution, so the $\chi^2$ random variable is continuous.

Just as we did with the normal and $t$ distributions, we can find probabilities associated with values of $\chi^2$, and vice versa. As with any continuous distribution, probability is represented by the area under the curve above an interval. We examine the properties of the $\chi^2$ distribution and then learn how to use the $\chi^2$ table to find the critical values of the distribution.

Properties of the $\chi^2$ Distribution

  • Just as for any continuous random variable, the total area under the $\chi^2$ curve equals 1.
  • The value of the $\chi^2$ random variable is never negative, so the $\chi^2$ curve starts at 0. However, it extends indefinitely to the right, with no upper bound.
  • Because of the characteristics just described, the $\chi^2$ curve is right-skewed.
  • There is a different curve for each number of degrees of freedom, $n - 1$. As the number of degrees of freedom increases, the $\chi^2$ curve begins to look more symmetric (Figure 34).
    FIGURE 34 Shape of the $\chi^2$ distribution for different degrees of freedom.
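
The properties listed above can be checked numerically. The following is a minimal sketch that uses only the Python standard library. The function names `chi2_pdf` and `area_under_curve` are illustrative and do not come from the text, and the skewness formula $\sqrt{8/\text{df}}$ is a standard result quoted here rather than derived in this section.

```python
import math

def chi2_pdf(x, df):
    """Density of the chi-square distribution with df degrees of freedom, for x > 0."""
    half = df / 2.0
    log_pdf = (half - 1.0) * math.log(x) - x / 2.0 - half * math.log(2.0) - math.lgamma(half)
    return math.exp(log_pdf)

def area_under_curve(df, upper=150.0, steps=50_000):
    """Midpoint-rule approximation of the area under the curve on (0, upper)."""
    width = upper / steps
    return sum(chi2_pdf((i + 0.5) * width, df) for i in range(steps)) * width

for df in (3, 5, 10, 30):
    area = area_under_curve(df)          # should be very close to 1
    skewness = math.sqrt(8.0 / df)       # positive (right-skewed), shrinking as df grows
    print(f"df={df:2d}  area={area:.4f}  skewness={skewness:.3f}")
```

The printed skewness values stay positive but shrink as the degrees of freedom increase, which is the numerical counterpart of the curves in Figure 34 looking more symmetric.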

To construct the confidence intervals in this section, we will need to find the critical values of a $\chi^2$ distribution for the given confidence level $100(1-\alpha)\%$, using either the $\chi^2$ table (Table E in the Appendix) or technology. The $\chi^2$ table is somewhat similar to the $t$ table (Table D in the Appendix); both tables show the degrees of freedom in the left column. The area to the right of the critical value is given across the top of the table.

The $\chi^2$ distribution is not symmetric, so we cannot construct the confidence interval for $\sigma^2$ using the “point estimate ± margin of error” method. Instead, the lower bound and upper bound for the confidence interval are determined using two critical values:

$\chi^2_{1-\alpha/2}$ = the value of the $\chi^2$ distribution with area $1-\alpha/2$ to its right (Figure 35)

$\chi^2_{\alpha/2}$ = the value of the $\chi^2$ distribution with area $\alpha/2$ to its right (Figure 35)

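For instance, for a 95% confidence interval, $1-\alpha = 0.95$, so $\alpha = 0.05$, $\alpha/2 = 0.025$, and $1-\alpha/2 = 0.975$. Thus, $\chi^2_{1-\alpha/2} = \chi^2_{0.975}$ represents the value of the $\chi^2$ distribution with area 0.975 to the right of the critical value. The second critical value, $\chi^2_{\alpha/2} = \chi^2_{0.025}$, represents the value of the $\chi^2$ distribution with area 0.025 to the right of the critical value.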

FIGURE 35 $\chi^2$ critical values.
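
Converting a confidence level into the two right-tail areas is simple arithmetic, and a short sketch may help when reading the column headings of the $\chi^2$ table. The function name `tail_areas` is illustrative and does not come from the text.

```python
def tail_areas(confidence_level):
    """Return the areas to the right of the lower and upper critical values.

    confidence_level is a proportion such as 0.95, that is, 1 - alpha.
    """
    alpha = 1.0 - confidence_level
    return 1.0 - alpha / 2.0, alpha / 2.0

for level in (0.90, 0.95, 0.99):
    lower_area, upper_area = tail_areas(level)
    print(f"{level:.0%} interval: table columns {lower_area:.3f} and {upper_area:.3f}")
```

For a 95% confidence level this gives the areas 0.975 and 0.025 used in the paragraph above.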

EXAMPLE 23 Finding the $\chi^2$ critical values

Note: If the appropriate degrees of freedom are not given in the table, the conservative solution is to take the next row with the smaller df.

Find the $\chi^2$ critical values for a 90% confidence interval, where we have a sample of size $n = 10$.

Solution

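For a 90% confidence interval,

$$1-\alpha = 0.90, \qquad \alpha = 0.10, \qquad \alpha/2 = 0.05, \qquad 1-\alpha/2 = 0.95$$

So we are seeking (1) $\chi^2_{1-\alpha/2} = \chi^2_{0.95}$, the critical value with area 0.95 to the right of it, and (2) $\chi^2_{\alpha/2} = \chi^2_{0.05}$, the critical value with area 0.05 to the right of it.

Because $n = 10$, the number of degrees of freedom is $n - 1 = 9$. To find $\chi^2_{0.95}$ for df = 9, go across the top of the $\chi^2$ table (Table E in the Appendix) until you see 0.95 (Figure 36). $\chi^2_{0.95}$ is somewhere in that column. Now go down that column until you reach the row for your number of degrees of freedom, df = 9. Thus, for df = 9, $\chi^2_{0.95} = 3.325$. For a $\chi^2$ distribution with 9 degrees of freedom, there is area = 0.95 to the right of 3.325.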

FIGURE 36 Finding $\chi^2_{1-\alpha/2}$ and $\chi^2_{\alpha/2}$ using the $\chi^2$ table.

Similarly, $\chi^2_{0.05}$ is found in the column labeled “0.05” and the row corresponding to df = 9. We find that $\chi^2_{0.05} = 16.919$, as shown in Figure 37.

FIGURE 37 $\chi^2$ critical values for the $\chi^2$ distribution with df = 9.
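
The table lookup in Example 23 can also be approximated with technology. The sketch below is one possible approach, not the method used by the table or by any particular package: it evaluates the $\chi^2$ cumulative probability with a series for the incomplete gamma function and then searches for the critical value by bisection. The names `chi2_cdf` and `chi2_critical` are illustrative.

```python
import math

def chi2_cdf(x, df):
    """Area to the LEFT of x under the chi-square curve with df degrees of freedom."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a            # first term of the incomplete-gamma series
    total = term
    k = 1
    while term > 1e-16 * total:
        term *= z / (a + k)
        total += term
        k += 1
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))

def chi2_critical(area_right, df, tol=1e-10):
    """Critical value with the given area to its RIGHT, found by bisection."""
    if not 0.0 < area_right < 1.0:
        raise ValueError("area_right must be strictly between 0 and 1")
    lo, hi = 0.0, float(df)
    while 1.0 - chi2_cdf(hi, df) > area_right:   # widen until the value is bracketed
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if 1.0 - chi2_cdf(mid, df) > area_right:
            lo = mid          # too much area to the right, so move right
        else:
            hi = mid
    return (lo + hi) / 2.0

print(f"{chi2_critical(0.95, 9):.3f}")   # compare with 3.325 from Table E
print(f"{chi2_critical(0.05, 9):.3f}")   # compare with 16.919 from Table E
```

In practice a statistics library would usually be preferred. For example, SciPy's `scipy.stats.chi2.ppf` takes the area to the left, so the area to the right must be subtracted from 1 before calling it.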

NOW YOU CAN DO

Exercises 9–16.

YOUR TURN #16

Find the $\chi^2$ critical values for a 95% confidence interval, where we have a sample of size $n = \dots$

(The solutions are shown in Appendix A.)

2 Constructing Confidence Intervals for the Population Variance and Standard Deviation

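We derive the formula for a confidence interval for the population variance $\sigma^2$. Suppose we take a random sample of size $n$ from a normal population with mean $\mu$ and standard deviation $\sigma$. Then the statistic

$$\chi^2 = \frac{(n-1)s^2}{\sigma^2}$$

follows a $\chi^2$ distribution with $n-1$ degrees of freedom, where $s^2$ represents the sample variance. From Figure 35, we see that $100(1-\alpha)\%$ of the values of $\chi^2$ lie between $\chi^2_{1-\alpha/2}$ and $\chi^2_{\alpha/2}$. These values are described as

$$\chi^2_{1-\alpha/2} < \frac{(n-1)s^2}{\sigma^2} < \chi^2_{\alpha/2}$$

Rearranging this inequality so that $\sigma^2$ is in the numerator gives us the formula for the $100(1-\alpha)\%$ confidence interval for $\sigma^2$:

$$\frac{(n-1)s^2}{\chi^2_{\alpha/2}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$$

Thus, the lower bound of the confidence interval for $\sigma^2$ is $(n-1)s^2/\chi^2_{\alpha/2}$, and the upper bound is $(n-1)s^2/\chi^2_{1-\alpha/2}$. Taking the square root of each gives us the lower and upper bounds of the confidence interval for $\sigma$.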
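
The rearrangement step can be written out in full. The worked derivation below starts from the statement that $(n-1)s^2/\sigma^2$ lies between the two critical values, and it assumes $s^2 > 0$ (the sample values are not all identical) so that every quantity involved is positive.

```latex
\begin{align*}
\chi^2_{1-\alpha/2} &< \frac{(n-1)s^2}{\sigma^2} < \chi^2_{\alpha/2} \\[4pt]
\frac{1}{\chi^2_{\alpha/2}} &< \frac{\sigma^2}{(n-1)s^2} < \frac{1}{\chi^2_{1-\alpha/2}}
  && \text{(taking reciprocals of positive quantities reverses the order)} \\[4pt]
\frac{(n-1)s^2}{\chi^2_{\alpha/2}} &< \sigma^2 < \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}
  && \text{(multiplying through by } (n-1)s^2 > 0\text{)}
\end{align*}
```

Because the reciprocal step reverses the order, the larger critical value $\chi^2_{\alpha/2}$ ends up in the lower bound and the smaller critical value $\chi^2_{1-\alpha/2}$ in the upper bound.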

Confidence Interval for the Population Variance

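Suppose we take a sample of size $n$ from a normal population with mean $\mu$ and standard deviation $\sigma$. Then a $100(1-\alpha)\%$ confidence interval for the population variance $\sigma^2$ is given by

$$\text{lower bound} = \frac{(n-1)s^2}{\chi^2_{\alpha/2}}, \qquad \text{upper bound} = \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$$

where $s^2$ represents the sample variance and $\chi^2_{1-\alpha/2}$ and $\chi^2_{\alpha/2}$ are the critical values for a $\chi^2$ distribution with $n-1$ degrees of freedom.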

Confidence Interval for the Population Standard Deviation

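A $100(1-\alpha)\%$ confidence interval for the population standard deviation $\sigma$ is then given by

$$\text{lower bound} = \sqrt{\frac{(n-1)s^2}{\chi^2_{\alpha/2}}}, \qquad \text{upper bound} = \sqrt{\frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}}$$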
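
The two interval formulas translate directly into code. The following is a minimal sketch in which the critical values are supplied by the user (for example, read from Table E). The function names are illustrative, and the sample variance of 4.0 in the usage lines is a hypothetical number chosen only to show the call. The critical values 3.325 and 16.919 are the df = 9 values found in Example 23.

```python
import math

def variance_interval(n, sample_variance, chi2_small, chi2_large):
    """Confidence interval for the population variance.

    chi2_small is the critical value with area 1 - alpha/2 to its right,
    chi2_large is the critical value with area alpha/2 to its right.
    """
    if not 0.0 < chi2_small < chi2_large:
        raise ValueError("expected 0 < chi2_small < chi2_large")
    numerator = (n - 1) * sample_variance
    # The larger critical value gives the lower bound, and vice versa.
    return numerator / chi2_large, numerator / chi2_small

def std_dev_interval(n, sample_variance, chi2_small, chi2_large):
    """Confidence interval for the population standard deviation."""
    lower, upper = variance_interval(n, sample_variance, chi2_small, chi2_large)
    return math.sqrt(lower), math.sqrt(upper)

# Hypothetical illustration: n = 10, s^2 = 4.0, 90% confidence level.
print(variance_interval(10, 4.0, 3.325, 16.919))
print(std_dev_interval(10, 4.0, 3.325, 16.919))
```

Keeping the critical values as arguments keeps the sketch independent of how they were obtained, whether from the table or from software.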

EXAMPLE 24 Constructing confidence intervals for the population variance and population standard deviation

The accompanying table shows the miles-per-gallon equivalent (MPGe) for five electric cars, as reported by www.hybridcars.com in 2014. The normal probability plot in Figure 38 indicates that the data are normally distributed.

FIGURE 38 Normal probability plot of miles-per-gallon equivalent for five electric cars.

Electric vehicle       Mileage (MPGe)
Tesla Model S          89
Nissan Leaf            99
Ford Focus             105
Mitsubishi i-MiEV      112
Chevrolet Spark        119

  1. Find the critical values $\chi^2_{1-\alpha/2}$ and $\chi^2_{\alpha/2}$ for a confidence interval with a 95% confidence level.
  2. Construct and interpret a 95% confidence interval for the population variance $\sigma^2$ of electric car MPGe.
  3. Construct and interpret a 95% confidence interval for the population standard deviation $\sigma$ of electric car MPGe.

Solution

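  1. There are $n = 5$ electric cars in our sample, so the degrees of freedom equal $n - 1 = 4$.

    For a 95% confidence interval,

    $$1-\alpha = 0.95, \qquad \alpha = 0.05, \qquad \alpha/2 = 0.025, \qquad 1-\alpha/2 = 0.975$$

    From the $\chi^2$ table (Table E in the Appendix), therefore,

    $$\chi^2_{1-\alpha/2} = \chi^2_{0.975} = 0.484 \qquad \text{and} \qquad \chi^2_{\alpha/2} = \chi^2_{0.025} = 11.143$$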

    Figures 39 through 41 show these results using Excel, Minitab, and JMP.

  2. Figure 42 shows the descriptive statistics for MPGe, as obtained by the TI-83/84. The sample standard deviation is $s \approx 11.5845$.

    FIGURE 39 Excel results.
    FIGURE 40 Minitab results.
    FIGURE 41 JMP results.
    FIGURE 42 TI-83/84 results.

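    Thus, our 95% confidence interval for $\sigma^2$ is given by

    $$\text{lower bound} = \frac{(n-1)s^2}{\chi^2_{0.025}} = \frac{(4)(11.5845)^2}{11.143} \approx 48.17, \qquad \text{upper bound} = \frac{(n-1)s^2}{\chi^2_{0.975}} = \frac{(4)(11.5845)^2}{0.484} \approx 1109.09$$

    We are 95% confident that the population variance $\sigma^2$ lies between 48.17 and 1109.09 miles per gallon squared, that is, (MPG)$^2$. (Recall that the variance is measured in units squared.) It is unclear what miles per gallon squared means, so we prefer to construct a confidence interval for the population standard deviation $\sigma$.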

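  3. Using the results from part 2,

    $$\text{lower bound} = \sqrt{48.17} \approx 6.94, \qquad \text{upper bound} = \sqrt{1109.09} \approx 33.30$$

We are 95% confident that the population standard deviation $\sigma$ lies between 6.94 and 33.30 miles per gallon. Figure 43 shows the two confidence intervals obtained using Minitab.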

FIGURE 43 Minitab results showing the confidence intervals.

Figure 44 shows the confidence interval for $\sigma$ obtained using JMP. We are interested in the bottom row, which has the confidence interval for the population standard deviation $\sigma$.

FIGURE 44 JMP results showing the confidence interval for $\sigma$.
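
The calculations in Example 24 can be retraced from the raw data. The sketch below uses Python's standard `statistics` module for the sample variance and takes the two df = 4 critical values, 0.484 and 11.143, as given table values rather than computing them. The variable names are illustrative.

```python
import math
import statistics

mpge = [89, 99, 105, 112, 119]        # the five electric cars in the example
n = len(mpge)
s2 = statistics.variance(mpge)        # sample variance (divides by n - 1)
s = math.sqrt(s2)                     # sample standard deviation

chi2_975 = 0.484                      # table value: area 0.975 to the right, df = 4
chi2_025 = 11.143                     # table value: area 0.025 to the right, df = 4

var_lower = (n - 1) * s2 / chi2_025
var_upper = (n - 1) * s2 / chi2_975
sd_lower, sd_upper = math.sqrt(var_lower), math.sqrt(var_upper)

print(f"s = {s:.4f}")
print(f"95% CI for the variance: ({var_lower:.2f}, {var_upper:.2f})")
print(f"95% CI for the standard deviation: ({sd_lower:.2f}, {sd_upper:.2f})")
```

The width of both intervals reflects how little information five observations carry about variability.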

NOW YOU CAN DO

Exercises 17–32.