OBJECTIVES By the end of this section, I will be able to …
1 The F Distribution and the F Test
In Sections 10.1–10.3, we were introduced to inference methods for comparing two population means and two population proportions. Here, we learn how to perform hypothesis tests regarding two population standard deviations. Wall Street investors are wary of excessive stock price variability. In this section, we will compare the variability of prices between two tech stocks, Google and Apple, using a new hypothesis test, called the F test. The F test will determine whether there is a significant difference in the variability of the stock prices, as measured by the respective population standard deviations. The F test is based on the F distribution, named in honor of the “grandfather of statistics,” Sir Ronald A. Fisher.
Let population 1 be Google stock prices and population 2 be Apple stock prices. We can test whether Google's stock prices are more variable than those of Apple; that is, we may test whether the standard deviation of Google stock prices, σ1, is greater than the standard deviation of Apple stock prices, σ2. This gives us the following hypotheses for our F test:
H0:σ1=σ2versusHa:σ1>σ2
Table 16 provides the three possible forms of hypotheses available when performing the F test for comparing two population standard deviations for two populations with standard deviations σ1 and σ2, respectively.
Form of test | Null hypothesis | Alternative hypothesis |
---|---|---|
Right-tailed test | H0:σ1=σ2 | Ha:σ1>σ2 |
Left-tailed test | H0:σ1=σ2 | Ha:σ1<σ2 |
Two-tailed test | H0:σ1=σ2 | Ha:σ1≠σ2 |
The requirements for performing the F test are the following:
The test statistic for the F test is Fdata, given as follows.
Test Statistic for the F Test for Comparing Two Population Standard Deviations
Suppose that the two population variances are equal, σ21=σ22, and that we have independent random samples of size n1 and n2 from two normally distributed populations with sample variances s21 and s22, respectively. Then the test statistic for the F test
Fdata=s21s22
follows an F distribution with n1−1 degrees of freedom in the numerator and n2−1 degrees of freedom in the denominator.
Let's become better acquainted with the F distribution. Similar to the Χ2 distribution, the F distribution is right-skewed, never takes negative values, and has an infinite number of different F curves (Figure 20). The shape of the curve depends on degrees of freedom.
As noted, the F distribution resembles the Χ2 distribution. This is not surprising because the values of the F distribution represent ratios of two Χ2 distributions. Moreover, the F distribution has two different degrees of freedom, which we will call df1 and df2, derived from the degrees of freedom of the two Χ2 distributions represented in the ratio. Often, df1 is called the numerator degrees of freedom, and df2 is called the denominator degrees of freedom.
Properties of the F Curve
The F distribution is continuous, so we can find probabilities associated with values of F, and vice versa, just as we did with the normal, t, and Χ2 distributions. Just as for any continuous distribution, probability is represented by the area below the F curve above an interval.
2 Perform the F Test for Comparing Two Population Standard Deviations: Critical-Value Method
To perform the hypothesis tests in this section, as well as in Chapters 12 and 13, we need to find the critical values of an F distribution for a given level of significance α. For example, we may need to find the value of an F distribution that has area α=0.05 to the right of it, or we may need to find the value of an F distribution that has area α=0.01 to the left of it. To find these F critical values, we will work with the F tables (see Appendix Table F). The F tables are somewhat different from the other tables that we have worked with so far.
The notation
Fcrit=Fα,n1−1,n2−1
represents the critical value of the F distribution with df1=n1−1 numerator degrees of freedom and df2=n2−1 denominator degrees of freedom, with area α to the right of Fα,n1−1,n2−1. For example, F0.05,15,10 represents the value of the F distribution with df1=15 and df2=10, with area 0.05 to the right of it. Next, we learn how to find the F critical values using the F tables.
Procedure for Finding F Critical Value for a Given Area α to the Right of it
Suppose we have an F distribution with df1 and df2 degrees of freedom. To find the critical value Fcrit that has area α to the right of it, do the following:
Note: When the degrees of freedom are not listed in the F table, we do not necessarily take the closest degrees of freedom we can find. This is because, sometimes, the closest degree of freedom is larger than the original, which leads to misleadingly overprecise results—α level of precision not warranted by the data. For example, this could lead us to find significance where none actually exists.
Developing Your Statistical Sense
Degrees of Freedom Not Listed in the Table
Just as with the t table, not all the degrees of freedom are listed in the F table. If either of the degrees of freedom df1 or df2 are not listed in the F table, a conservative solution is to take the next smallest value for whichever of df1 or df2 is not listed. For example, suppose df1=n1−1=57−1=56 and df2=n2−1=170−1=169. Neither df1=56 nor df2=169 is given in the F table. Therefore, we set df to be the next smallest value given in the table, df1=50, and we set df2 to be the next smallest value, df2=160.
The F tables give only the F critical values for a given area α to the right. To find F critical values for a given area α to the left, we use the following property:
F1−α,n1−1,n2−1=1Fα,n2−1,n1−1
In other words, the value from an F distribution with degrees of freedom df1=n1−1 and df2=n2−1 and area α to the left of it equals the reciprocal of the value from an F distribution with degrees of freedom df1=n2−1 and df2=n1−1 and area α to the right of it. Note that the two degrees of freedom get switched.
Procedure for Finding F Critical Value for a Given Area α to the Left of it
So, for example, to find the value of an F distribution with degrees of freedom df1=10 and df2=15 with area α=0.05 to the left of it, follow steps 1 and 2 above to find the value of an F distribution with df1=15 and df2=10 with area α=0.05 to the right of it. Then compute the reciprocal.
EXAMPLE 19 Finding critical values of the F distribution
Use the excerpt from the F distribution tables in Figure 21 to find the following critical values of the F distribution:
Solution
Step 2. Next to the 7 is a range of α values from 0.100 to 0.001. We choose the row with α=0.05. The F-value in that row and column is 4.74. Thus, our F critical value is
Fcrit=F0.05,2,7=4.74
Fcrit=F0.99,6,3=1F0.01,6,3=19.78=0.1023
NOW YOU CAN DO
Exercises 9–20.
Now we use the critical values of the F distribution to help us with the critical-value method of performing the F test for comparing two population standard deviations. Later we show the steps for the p-value method, along with an example.
F Test for Comparing Two Population Standard Deviations: Critical-Value Method
Suppose we have two independent random samples of size n1 and n2 taken from two normally distributed populations, with population standard deviations σ1 and σ2, and sample standard deviations s1 and s2, respectively.
Step 3 Find Fdata.
Fdata=s21s22
follows an F distribution with df1=n1−1 and df2=n2−1.
![]() |
We now return to our example comparing the variability of stock prices between Google and Apple.
EXAMPLE 20 F test for comparing two population standard deviations: Critical-value method
Table 18 shows independent samples from Google and Apple stock prices from July 2014 together with the sample sizes and sample standard deviations. Test, using the critical-value method, whether the standard deviation of Google stock prices σ1 is greater than the standard deviation of Apple stock prices σ2.
Apple | |
---|---|
574.79 590.76 583.04 593.06 580.82 599.02 579.55 587.78 |
93.52 94.03 95.39 96.45 95.60 99.02 97.03 |
n1=8s1≈7.999701 | n2=7s2≈1.862594 |
Solution
The normal probability plots in Figures 22a and 22b show acceptable normality for both samples. We may, therefore, proceed with the F test for comparing population standard deviations.
Step 1 State the hypotheses. We are testing whether Google's stock prices are more variable than those of Apple. Thus, because Google represents population 1, we have the following hypotheses for our F test:
H0:σ1=σ2versusHa:σ1>σ2
where σ1 represents the standard deviation of Google stock prices and σ2 represents the standard deviation of Apple stock prices. Use level of significance α=0.05.
Step 2 Find the critical value and state the rejection rule. We have df1=n1−1=7 and df2=n2−1=6. From Table 17 and Appendix Table F, our critical value is the F-value with area α=0.05 to the right of it:
Fcrit=Fα,n1−1,n2−1=F0,0.5,7.6=4.21
Our rejection rule is, therefore, from Table 17: Reject H0 if Fdata>4.21.
Step 3 Find Fdata
Fdata=S21S22=7.99970121.8625942≈18.45
follows an F distribution with df1=n1−1=n1−1=7 and df2=n2−1=n2−1=6.
NOW YOU CAN DO
Exercises 21–26.
3 Perform the F Test for Comparing Two Population Standard Deviations: p-Value Method
We may also use the p-value method to perform the F test for comparing two population standard deviations. The requirements are the same.
F Test for comparing two Population Standard Deviations: p -Value Method
Suppose we have two independent random samples of size n1, and n2 taken from two normally distributed populations, with population standard deviations σ1 and σ2, and sample standard deviations s1 and s2, respectively.
Step 2 Find Fdata.
Fdata=S21S22
follows an F distribution with df1=n1−1 and df2=n2−1.
![]() |
We illustrate the p-value method with an example.
EXAMPLE 21 F Test for comparing two population standard deviations: p-value method
The Web site Medicare.gov publishes survey information on patient attitudes about their level of care. Table 20 shows the percentages of respondents taken from independent random samples of hospitals in Florida and Georgia, which reported that their nurses always communicated well.
Florida | Georgia |
---|---|
67 66 70 70 72 73 | 72 75 78 73 68 71 |
63 69 65 68 65 | 82 72 77 75 73 |
n1=11s1=3.1305 | n2=11s2=3.8162 |
Test whether there is a difference in variability between the two states, using α=0.10.
Solution
Because the normal probability plots in Figures 23a and 23b show acceptable normality, we may therefore proceed with the F test for comparing population standard deviations.
Step 1 State the hypotheses. We are testing whether there is a difference in the standard deviation of the percentages for Florida (σ1) and Georgia (σ2). We therefore have a two-tailed test:
H0:σ1=σ2versusHa:σ1≠σ2
where σ1 and σ2 represent the standard deviations of the percent of Florida and Georgia respondents, respectively, who reported that their nurses always communicated well. Use level of significance α=0.10.
Step 2 Find Fdata.
Fdata=s21s22=3.130523.81622≈0.6729
follows an F distribution with df1=n1−1=10 and df2=n2−1=10.
Step 3 Find the p-value. Because we have a two-tailed test, Table 19 states that the p-value is
the smaller of (i)2⋅P(F>Fdata)and(ii)2⋅P(F<Fdata)
Figure 24a shows the output from the TI-83/84, giving P(F>Fdata) as the area under the F distribution curve between 0.6729 and infinity. This gives
2⋅P(F>Fdata)=2⋅0.7287420993=1.457484199
This cannot represent a valid p-value, because it is larger than 1. Figure 24b shows the output from the TI-83/84, giving P(F<Fdata) as the area under the F distribution curve between 0 and 0.6729. Thus, our p-value is:
p-value=2⋅P(F<Fdata)=2⋅P(F<0.6729)=2⋅0.2712579007≈0.5425
because this is smaller than 1.457484199.
NOW YOU CAN DO
Exercises 27–32.