12.1 One-Way Analysis of Variance (ANOVA)


OBJECTIVES By the end of this section, I will be able to …

  1. Explain how analysis of variance works.
  2. Perform one-way analysis of variance.

1 How Analysis of Variance (ANOVA) Works


Analysis of variance (ANOVA) is a hypothesis test for determining whether three or more means of different populations are equal. ANOVA works by comparing the variability between the samples to the variability within the samples.

Suppose we are interested in determining whether significant differences exist in grade point averages (GPAs) among residents of three dormitories, A, B, and C. Table 1 displays three random samples of GPAs of 10 residents from each dormitory.

Table 12.1: Table 1 Sample GPAs from Dorms A, B, and C
A 0.60 3.82 4.00 2.22 1.46 2.91 2.20 1.60 0.89 2.30
B 2.12 2.00 1.03 3.47 3.70 1.72 3.15 3.93 1.26 2.62
C 3.65 1.57 3.36 1.17 2.55 3.12 3.60 4.00 2.85 2.13

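The sample mean GPA for Dormitory A is

$$\bar{x}_A = \frac{0.60 + 3.82 + \cdots + 2.30}{10} = \frac{22.00}{10} = 2.2$$

Similarly, we can find the sample mean GPAs for the other dormitories: $\bar{x}_B = 2.5$ and $\bar{x}_C = 2.8$. We note that the sample means are not equal. The question is: Are the population means equal? Let $\mu_A$, $\mu_B$, and $\mu_C$ represent the population mean GPAs for Dormitories A, B, and C, respectively. We are interested in the following hypotheses, where $\mu_i$ represents the population mean GPA for dormitory $i$:

$$H_0: \mu_A = \mu_B = \mu_C \quad \text{versus} \quad H_a: \text{not all the population means are equal}$$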

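Sufficient differences in the sample means would represent evidence that the population means were not equal. The question is: What represents “sufficiently” different? We need something to compare against, such as the spread of each sample. One measure of spread or variability is the range:

$$\text{range} = \text{largest value} - \text{smallest value}$$

We have

$$\text{range}_A = 4.00 - 0.60 = 3.40, \quad \text{range}_B = 3.93 - 1.03 = 2.90, \quad \text{range}_C = 4.00 - 1.17 = 2.83$$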

These ranges are rather large spreads, and there is a considerable amount of overlap among the different dormitory GPAs, as shown in Figure 1.

Figure 1 shows the difference among the means for the three dorm GPAs compared with the spread of each dorm's GPAs, as measured by the range. The red triangles represent the sample means, $\bar{x}_A$, $\bar{x}_B$, and $\bar{x}_C$. The spread of the sample means (shown by the red arrows) is much less than the spreads of the individual dorm GPAs (shown by the green arrows). Thus, the sample means $\bar{x}_A$, $\bar{x}_B$, and $\bar{x}_C$ are not sufficiently different when compared against the spread of the GPAs. This graph would therefore not provide evidence to reject the null hypothesis that the population mean GPAs are all equal.

Figure 12.1: FIGURE 1 Comparison dotplot of GPAs for Dorms A, B, and C.

Now we make a similar comparison for the GPAs for Dormitories D, E, and F in Table 2.

Table 12.2: Table 2 Sample GPAs from Dorms D, E, and F
D 2.16 2.23 2.09 2.17 2.25 2.19 2.24 2.28 2.25 2.14
E 2.45 2.34 2.58 2.49 2.60 2.42 2.55 2.62 2.45 2.50
F 2.80 2.75 2.93 2.68 2.88 2.75 2.87 2.81 2.73 2.80

The sample mean GPAs for Dormitories D, E, and F are the same as those for Dormitories A, B, and C, respectively: $\bar{x}_D = 2.2$, $\bar{x}_E = 2.5$, and $\bar{x}_F = 2.8$. Again, we are interested in whether the population means are equal.

Consider the comparison dotplot in Figure 2. There now seems to be better evidence for concluding that the three population means are not all equal. There is no overlap among the three samples because the spread within each dormitory is much smaller than for Dormitories A, B, and C.

Figure 2 shows the difference among the means for the three dorm GPAs compared with the range of each dorm's GPAs. The red triangles represent the sample means, $\bar{x}_D$, $\bar{x}_E$, and $\bar{x}_F$. The spread of the sample means (red arrows) is much greater than the spreads of the individual dorm GPAs (green arrows). Thus, the sample means $\bar{x}_D$, $\bar{x}_E$, and $\bar{x}_F$ are sufficiently different when compared against the range of the GPAs. This graph would, therefore, provide some evidence to reject the null hypothesis that the population mean GPAs are all equal.

Figure 12.2: FIGURE 2 Comparison dotplot of GPAs for Dorms D, E, and F.


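Note that we arrived at opposite conclusions for the two sets of dormitories, even though the sample means of the first group are identical to the sample means of the second group. Here is the key difference:

  • For Dormitories A, B, and C, the variability in the sample means is small compared with the variability within each sample, so there is not enough evidence to reject the null hypothesis.
  • For Dormitories D, E, and F, the variability in the sample means is large compared with the variability within each sample, so there is evidence to reject the null hypothesis.

These are the types of comparisons that the ANOVA method makes.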

Instead of using the range as the measure of spread, analysis of variance uses the standard deviation of the individual samples. Recall that samples with larger spread have larger standard deviations, just as they have larger ranges.
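To make this comparison concrete, the following Python sketch computes the mean, range, and standard deviation of each of the six dormitory samples in Tables 1 and 2. It uses only the standard library; the names `dorms`, `summarize`, and `summary` are illustrative choices, not part of the text.

```python
from statistics import mean, stdev

# GPA samples copied from Table 1 (Dorms A-C) and Table 2 (Dorms D-F).
dorms = {
    "A": [0.60, 3.82, 4.00, 2.22, 1.46, 2.91, 2.20, 1.60, 0.89, 2.30],
    "B": [2.12, 2.00, 1.03, 3.47, 3.70, 1.72, 3.15, 3.93, 1.26, 2.62],
    "C": [3.65, 1.57, 3.36, 1.17, 2.55, 3.12, 3.60, 4.00, 2.85, 2.13],
    "D": [2.16, 2.23, 2.09, 2.17, 2.25, 2.19, 2.24, 2.28, 2.25, 2.14],
    "E": [2.45, 2.34, 2.58, 2.49, 2.60, 2.42, 2.55, 2.62, 2.45, 2.50],
    "F": [2.80, 2.75, 2.93, 2.68, 2.88, 2.75, 2.87, 2.81, 2.73, 2.80],
}


def summarize(sample):
    """Return (mean, range, sample standard deviation) for one sample."""
    return mean(sample), max(sample) - min(sample), stdev(sample)


summary = {name: summarize(values) for name, values in dorms.items()}

for name, (m, r, s) in summary.items():
    print(f"Dorm {name}: mean={m:.2f}  range={r:.2f}  sd={s:.4f}")
```

The means match pairwise (A with D, B with E, C with F), while both the range and the standard deviation are far smaller for Dorms D, E, and F than for Dorms A, B, and C.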

Developing Your Statistical Sense

How Does Analysis of Variance Work?

The key to how analysis of variance works is the following comparison. Compare

  1. the variability in the sample means—that is, how large the differences are between the sample means (indicated by the lengths of the red arrows in Figures 1 and 2)—with
  2. the variability within each sample—that is, the within-sample spreads (indicated by the lengths of the green arrows in Figures 1 and 2).

When (1) is much larger than (2), this is evidence that the population means are not all equal and that we should reject the null hypothesis. Thus, our analysis depends on measuring variability, hence the term analysis of variance.

Just as for hypothesis-testing procedures from previous chapters, analysis of variance can be performed only if certain requirements are met.

Requirements for Performing Analysis of Variance

  1. Each of the populations is normally distributed.
  2. The variances of the populations are all equal.
  3. The samples are independently drawn.

Note: In analysis of variance, the null hypothesis always states that all the population means are equal and the alternative hypothesis always states that not all the population means are equal. Note that $H_a$ is not stating that the population means are all different. For $H_a$ to be true, it is sufficient for a single population mean to be different, even though all the other population means may be equal.

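Our hypotheses for testing for the equality of the population mean GPA for Dormitories A, B, and C are

$$H_0: \mu_A = \mu_B = \mu_C \quad \text{versus} \quad H_a: \text{not all the population means are equal}$$

Let us stop for a moment to consider what these requirements and the hypotheses mean.

  • Requirement 1 states that each population is normally distributed.
  • Requirement 2 states that the populations all share the same variance, $\sigma^2$.
  • The null hypothesis $H_0$ states that the populations all share the same mean, $\mu$.

Putting all this together, $H_0$ assumes that the observations from each population come from the same normal distribution, with mean $\mu$ and variance $\sigma^2$.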


Suppose we then take samples of size $n$ from each group. Fact 3 in Chapter 7 states that the sampling distribution of $\bar{x}$ for a sample of size $n$ taken from a normal population with mean $\mu$ and standard deviation $\sigma$ (that is, variance $\sigma^2$) is also normal, with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$ (that is, variance $\sigma^2/n$), as shown in Figure 3. Each dormitory's sample mean GPA is assumed (under $H_0$) to come from the same sampling distribution, so we would expect the sample means to be fairly close together.

On the other hand, if $H_0$ is not true, then not all the population means are equal (Figure 4). In this case, there is no sampling distribution common to all sample means, so we would not expect the sample means to be close together. Note in Figure 4 that each distribution nevertheless has the same shape (normal) and spread (that is, variance) because of the requirements.

Figure 12.3: FIGURE 3 Common sampling distribution when $H_0$ is true.
Figure 12.4: FIGURE 4 No common sampling distribution when $H_0$ is not true.

Note: Normal probability plots were introduced in Chapter 7.

Procedure for Verifying the Requirements for Analysis of Variance

  • Step 1 Normality. Check that the data from each group are normally distributed, using normal probability plots.
  • Step 2 Equal Variances. Compute the sample standard deviation for each group to verify that the largest standard deviation is not larger than twice the smallest standard deviation.
  • Step 3 Independence. Verify that the samples drawn from each group are independently drawn.
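Step 2 is a rule of thumb that can be checked directly. Below is a minimal sketch, assuming each sample is held in a plain Python list; the function name `equal_variance_rule` is hypothetical. Normality and independence (Steps 1 and 3) still have to be judged from the probability plots and from how the samples were drawn.

```python
from statistics import stdev


def equal_variance_rule(samples):
    """Return (largest_sd, smallest_sd, ok).

    ok is True when the largest sample standard deviation is not
    larger than twice the smallest, as described in Step 2.
    """
    sds = [stdev(sample) for sample in samples]
    largest, smallest = max(sds), min(sds)
    return largest, smallest, largest <= 2 * smallest


# GPA samples from Table 1.
dorm_a = [0.60, 3.82, 4.00, 2.22, 1.46, 2.91, 2.20, 1.60, 0.89, 2.30]
dorm_b = [2.12, 2.00, 1.03, 3.47, 3.70, 1.72, 3.15, 3.93, 1.26, 2.62]
dorm_c = [3.65, 1.57, 3.36, 1.17, 2.55, 3.12, 3.60, 4.00, 2.85, 2.13]

largest, smallest, ok = equal_variance_rule([dorm_a, dorm_b, dorm_c])
print(f"largest sd = {largest:.6f}, smallest sd = {smallest:.6f}, rule satisfied: {ok}")
```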

EXAMPLE 1 Verify the requirements for performing an analysis of variance

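Verify the requirements for performing an analysis of variance using the hypotheses

$$H_0: \mu_A = \mu_B = \mu_C \quad \text{versus} \quad H_a: \text{not all the population means are equal}$$

where $\mu_i$ represents the population mean GPA for Dormitory $i$, using data from Table 1.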

Solution

  • Step 1 Normality. To verify that each of the populations is normally distributed, we examine normal probability plots of each sample, shown in Figure 5. Each plot indicates acceptable normality.

    Figure 12.5: FIGURE 5 Normal probability plots verify normality requirement.


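  • Step 2 Equal Variances. To find the standard deviation for Dorm A, we first find

    $$\sum (x - \bar{x}_A)^2 = (0.60 - 2.2)^2 + (3.82 - 2.2)^2 + \cdots + (2.30 - 2.2)^2 = 11.5626$$

    Then

    $$s_A = \sqrt{\frac{\sum (x - \bar{x}_A)^2}{n_A - 1}} = \sqrt{\frac{11.5626}{9}} \approx 1.133461$$

    We similarly find $s_B \approx 1.030857$ and $s_C \approx 0.937028$. The largest, $s_A \approx 1.133461$, is not larger than twice the smallest, $2 s_C \approx 1.874056$. Thus, the equal variance requirement is satisfied.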

  • Step 3 Independence. Because the students are randomly sampled from each dormitory, with the selection of students in one dormitory not affecting the selection of students sampled from the other dormitories, the independence assumption is also validated.

Note: We retain many decimal places when calculating $s_A$, $s_B$, and $s_C$ because these values are used to calculate other quantities later on.

NOW YOU CAN DO

Exercises 7–10.

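Note: This form for $\bar{\bar{x}}$ is a weighted mean with the weights being the sample sizes.

Assuming that $H_0$ is true, we estimate the common population mean $\mu$ using the overall sample mean, $\bar{\bar{x}}$:

$$\bar{\bar{x}} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2 + \cdots + n_k \bar{x}_k}{n_t}$$

where there are $k$ samples and $n_t = n_1 + n_2 + \cdots + n_k$ is the “total sample size” (sum of the sample sizes). The overall sample mean is simply the mean of all the observations from all the samples. For the special case when all the sample sizes are equal, the overall sample mean is simply the mean of the sample means,

$$\bar{\bar{x}} = \frac{\bar{x}_1 + \bar{x}_2 + \cdots + \bar{x}_k}{k}$$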

EXAMPLE 2 Calculating the overall sample mean

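For the sample GPA data given in Table 1 for Dorms A, B, and C, calculate the overall sample mean, $\bar{\bar{x}}$.

Solution

We have $k = 3$ dormitories, with sample mean GPAs $\bar{x}_A = 2.2$, $\bar{x}_B = 2.5$, $\bar{x}_C = 2.8$. Also, $n_A = n_B = n_C = 10$, and $n_t = 30$. Thus,

$$\bar{\bar{x}} = \frac{10(2.2) + 10(2.5) + 10(2.8)}{30} = \frac{75}{30} = 2.5$$

All the sample sizes are equal, so we can also calculate $\bar{\bar{x}}$ as follows:

$$\bar{\bar{x}} = \frac{2.2 + 2.5 + 2.8}{3} = 2.5$$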

NOW YOU CAN DO

Exercises 11–14.


What Does This Number Mean?

$\bar{\bar{x}} = 2.5$ is the mean GPA for all 30 students from all three samples. We can use $\bar{\bar{x}}$ as our estimate of the common population mean $\mu$ assumed in $H_0$.
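The weighted-mean form of the overall sample mean can be sketched in a few lines of Python. The function name `overall_mean` is hypothetical, and the unequal sample sizes in the last call are made up purely to show that the shortcut (mean of the sample means) applies only when the sample sizes are equal.

```python
def overall_mean(sizes, means):
    """Overall sample mean: the sample means weighted by their sample sizes."""
    n_t = sum(sizes)  # total sample size
    return sum(n * m for n, m in zip(sizes, means)) / n_t


means = [2.2, 2.5, 2.8]  # sample mean GPAs for Dorms A, B, and C

# Equal sample sizes, as in Table 1.
x_bar_bar = overall_mean([10, 10, 10], means)

# Shortcut that is valid only because the sample sizes are equal.
shortcut = sum(means) / len(means)

# Hypothetical unequal sample sizes: the weighted mean shifts toward
# the larger samples, so the shortcut no longer applies.
unequal = overall_mean([5, 10, 15], means)

print(x_bar_bar, shortcut, unequal)
```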

Recall that analysis of variance works by comparing the variability in the sample means to the variability within each sample. We use the following statistics to measure these variabilities.

The greater the distance between the sample means, the larger the MSTR.

The larger the standard deviation of the samples, the larger the MSE.

The mean square treatment (MSTR) measures the variability in the sample means. MSTR is the sample variance of the sample means, weighted by sample size.

$$\text{MSTR} = \frac{\sum n_i (\bar{x}_i - \bar{\bar{x}})^2}{k - 1}$$

where $n_i$ and $\bar{x}_i$ are the sample size and mean of the $i$th sample, $\bar{\bar{x}}$ is the overall sample mean, and there are $k$ populations.

The mean square error (MSE) measures the variability within the samples. MSE is the mean of the sample variances, weighted by sample size.

$$\text{MSE} = \frac{\sum (n_i - 1) s_i^2}{n_t - k}$$

where $n_i$ and $s_i^2$ are the sample size and variance of the $i$th sample, $n_t$ is the total sample size, and there are $k$ populations.

We compare MSTR to MSE by taking the ratio of these two quantities. This ratio MSTR/MSE follows the $F$ distribution that we learned about in Section 10.4.

The student may want to review the characteristics of the $F$ distribution in Section 10.4.

The test statistic for analysis of variance is

$$F_{\text{data}} = \frac{\text{MSTR}}{\text{MSE}}$$

$F_{\text{data}}$ measures the variability among the sample means, compared to the variability within the samples. $F_{\text{data}}$ follows an $F$ distribution with $df_1 = k - 1$ and $df_2 = n_t - k$, when the following requirements are met: (1) each of the populations is normally distributed, (2) the variances of the populations are all equal, and (3) the samples are independently drawn.

The term mean square represents a weighted mean of quantities that are squared. Each mean square itself consists of two parts: the sum of squares in the numerator and the degrees of freedom in the denominator. The numerator for MSTR is called the sum of squares treatment (SSTR), and the numerator for MSE is called the sum of squares error (SSE).

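The total sum of squares (SST) is found by adding SSTR and SSE:

$$\text{SST} = \text{SSTR} + \text{SSE}, \quad \text{where} \quad \text{SSTR} = \sum n_i (\bar{x}_i - \bar{\bar{x}})^2 \quad \text{and} \quad \text{SSE} = \sum (n_i - 1) s_i^2$$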

The ANOVA table shown in Table 3 is a convenient way to display the various statistics calculated during an analysis of variance. Note that the quantities in the mean square column equal the ratio of the two columns to its left.


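Table 12.3: Table 3 ANOVA table
Source of variation    Sum of squares    Degrees of freedom    Mean square                         $F$-test statistic
Treatment              SSTR              $df_1 = k - 1$        $\text{MSTR} = \text{SSTR}/df_1$    $F_{\text{data}} = \text{MSTR}/\text{MSE}$
Error                  SSE               $df_2 = n_t - k$      $\text{MSE} = \text{SSE}/df_2$
Total                  SST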

EXAMPLE 3 Constructing the ANOVA table

Use the summary statistics in Table 4 for the sample GPAs for Dorms A, B, and C to construct the ANOVA table.

Table 12.4: Table 4 Summary statistics for sample GPAs for Dorms A, B, and C
                      Dorm A      Dorm B      Dorm C
Mean                  2.2         2.5         2.8
Standard deviation    1.133461    1.030857    0.937028
Sample size           10          10          10

Solution

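We have $k = 3$ dormitories, and total sample size $n_t = 30$. Thus,

$$\text{SSTR} = 10(2.2 - 2.5)^2 + 10(2.5 - 2.5)^2 + 10(2.8 - 2.5)^2 = 1.8$$

$$\text{SSE} = 9(1.133461)^2 + 9(1.030857)^2 + 9(0.937028)^2 \approx 29.0288$$

$$\text{SST} = \text{SSTR} + \text{SSE} = 1.8 + 29.0288 = 30.8288$$

$$df_1 = k - 1 = 2, \quad df_2 = n_t - k = 27$$

$$\text{MSTR} = \frac{1.8}{2} = 0.9, \quad \text{MSE} = \frac{29.0288}{27} = 1.0751407407, \quad F_{\text{data}} = \frac{0.9}{1.0751407407} \approx 0.8371$$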

We summarize these calculations in the following ANOVA table, with the results rounded for clarity.

Source of variation    Sum of squares    Degrees of freedom    Mean square      $F$-test statistic
Treatment              SSTR = 1.8        $df_1 = 2$            MSTR = 0.9       $F_{\text{data}} = 0.8371$
Error                  SSE = 29.0288     $df_2 = 27$           MSE = 1.0751
Total                  SST = 30.8288
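The construction of the ANOVA table from summary statistics can be sketched as follows. This is a minimal illustration, not the procedure of any particular software package: the function name `anova_from_summary` and the dictionary keys are hypothetical, and the standard deviations are recomputed here from the Table 1 data and rounded to six decimal places.

```python
def anova_from_summary(sizes, means, sds):
    """Build the entries of a one-way ANOVA table from group summaries."""
    k = len(sizes)
    n_t = sum(sizes)
    grand_mean = sum(n * m for n, m in zip(sizes, means)) / n_t

    # Sums of squares: between groups (treatment) and within groups (error).
    sstr = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))
    sse = sum((n - 1) * s ** 2 for n, s in zip(sizes, sds))

    df1, df2 = k - 1, n_t - k
    mstr, mse = sstr / df1, sse / df2
    return {
        "SSTR": sstr, "SSE": sse, "SST": sstr + sse,
        "df1": df1, "df2": df2,
        "MSTR": mstr, "MSE": mse, "F": mstr / mse,
    }


# Summary statistics for Dorms A, B, and C (standard deviations are
# recomputed from Table 1 and rounded; treat them as an assumption).
table = anova_from_summary(
    sizes=[10, 10, 10],
    means=[2.2, 2.5, 2.8],
    sds=[1.133461, 1.030857, 0.937028],
)

for key, value in table.items():
    print(f"{key:>4} = {value:.4f}")
```

Each mean square is its sum of squares divided by the degrees of freedom on the same row, which is the relationship noted for Table 3.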

NOW YOU CAN DO

Exercises 15–22.


2 Performing One-way ANOVA

Now that we know how it works, we next learn how to perform ANOVA.

Remember: $H_a$ is not stating that the population means are all different.

One-way Analysis of Variance

We have taken random samples from each of $k$ populations and want to test whether the population means of the $k$ populations are all equal.

Required conditions:

  1. Each of the populations is normally distributed.
  2. The variances of the populations are all equal.
  3. The samples are independently drawn.
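  • Step 1 State the hypotheses, and state the rejection rule.

    $$H_0: \mu_1 = \mu_2 = \cdots = \mu_k \quad \text{versus} \quad H_a: \text{not all the population means are equal}$$

    where the $\mu_i$ represent the population mean from each population. The rejection rule is Reject $H_0$ if the $p$-value $\le \alpha$.

  • Step 2 Calculate $F_{\text{data}}$.

    $$F_{\text{data}} = \frac{\text{MSTR}}{\text{MSE}}$$

    where

    $$\text{MSTR} = \frac{\text{SSTR}}{k - 1} = \frac{\sum n_i (\bar{x}_i - \bar{\bar{x}})^2}{k - 1} \quad \text{and} \quad \text{MSE} = \frac{\text{SSE}}{n_t - k} = \frac{\sum (n_i - 1) s_i^2}{n_t - k}$$

    $F_{\text{data}}$ follows an $F$ distribution with $df_1 = k - 1$ and $df_2 = n_t - k$ if the required conditions are satisfied, where $n_t$ represents the total sample size.

  • Step 3 Find the $p$-value. Use technology to find the $p$-value $= P(F > F_{\text{data}})$, as shown in Figure 6.
  • Step 4 State the conclusion and the interpretation. Compare the $p$-value with $\alpha$.
    Figure 12.6: FIGURE 6 $p$-Value for the one-way ANOVA test.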

EXAMPLE 4 Performing one-way ANOVA using the $p$-value method

Test, using level of significance $\alpha$, whether the population mean GPAs from Example 1 differ among the students in Dormitories A, B, and C.

What Result Might We Expect?

Recall that the comparison dotplot in Figure 1 showed a large amount of overlap in the GPAs among Dormitories A, B, and C. The large ranges illustrate the large within-dormitory spread of the GPAs for these dorms. When compared against this large within-sample variability, the variability in sample means may not seem large. Therefore, we might expect that the null hypothesis of no difference will not be rejected.


Solution

We already verified the requirements for performing the analysis of variance in Example 1.

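  • Step 1 State the hypotheses, and state the rejection rule. Define the $\mu_i$.

    $$H_0: \mu_A = \mu_B = \mu_C \quad \text{versus} \quad H_a: \text{not all the population means are equal}$$

    where $\mu_i$ represents the population mean GPA of students from dormitory $i$. The rejection rule is Reject $H_0$ if the $p$-value $\le \alpha$.

  • Step 2 Calculate $F_{\text{data}}$. From Example 3, we have MSTR = 0.9, MSE = 1.0751407407, and

    $$F_{\text{data}} = \frac{\text{MSTR}}{\text{MSE}} = \frac{0.9}{1.0751407407} \approx 0.8371$$

    $F_{\text{data}}$ follows an $F$ distribution with $df_1 = k - 1 = 3 - 1 = 2$ and $df_2 = n_t - k = 30 - 3 = 27$.

  • Step 3 Find the $p$-value. We use the instructions provided in the Step-by-Step Technology Guide at the end of this section. From Figures 7 and 8, we have

    $$p\text{-value} = P(F > 0.8371) \approx 0.4439$$

    Figure 12.7: FIGURE 7 The $p$-value for $F_{\text{data}}$.
    Figure 12.8: FIGURE 8 TI-83/84 $p$-value.
  • Step 4 State the conclusion and the interpretation. Compare the $p$-value with $\alpha$. The $p$-value of 0.4439 is not $\le \alpha$, so we do not reject $H_0$. As expected, there is not enough evidence to conclude at level of significance $\alpha$ that not all population mean GPAs are equal.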
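The $p$-value in Step 3 can be cross-checked without a calculator in this particular case. The sketch below is not the technology procedure used in the text: it relies on a closed form for the right-tail area of the $F$ distribution that holds only when the numerator degrees of freedom equal 2, namely $P(F > f) = (1 + 2f/df_2)^{-df_2/2}$. The function name `f_upper_tail_df1_2` is hypothetical.

```python
def f_upper_tail_df1_2(f_value, df2):
    """Right-tail area P(F > f_value) for an F distribution with
    2 numerator and df2 denominator degrees of freedom.

    This closed form is valid ONLY when the numerator df equals 2.
    """
    return (1 + 2 * f_value / df2) ** (-df2 / 2)


# Values from Example 3: MSTR = 0.9, MSE = 1.0751407407, df2 = 27.
f_data = 0.9 / 1.0751407407
p_value = f_upper_tail_df1_2(f_data, 27)

print(f"F_data = {f_data:.4f}, p-value = {p_value:.4f}")
```

For any other numerator degrees of freedom, statistical software or the $F$ tables are still needed.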

When calculating the $p$-value for analysis of variance, always retain as many decimal places in the value of $F_{\text{data}}$ as you can. This will make the $p$-value as accurate as possible. Rounding too much will make the $p$-value less accurate.

NOW YOU CAN DO

Exercises 23–28.

EXAMPLE 5 Performing one-way ANOVA using technology


Researchers from the Institute for Behavioral Genetics at the University of Colorado investigated the effect that the enzyme protein kinase C (PKC) has on anxiety in mice. The genotype for a particular gene in a mouse (or a human) consists of two alleles (copies) of each chromosome, one each from the father and mother. The investigators in the study separated the mice into three groups. In Group 0, neither of the mice's alleles for PKC produced the enzyme. In Group 1, one of the two alleles for PKC produced the enzyme and the other did not. In Group 2, both PKC alleles produced the enzyme. To measure the anxiety in the mice, scientists measured the time (in seconds) the mice spent in the “open-ended” sections of an elevated maze. It was surmised that mice spending more time in open-ended sections exhibit decreased anxiety. The data are provided in Table 5. Use technology to test, at level of significance $\alpha$, whether the population mean time spent in the open-ended sections of the maze was the same for all three groups.



Table 12.6: Table 5 Time (in seconds) spent in open-ended section of maze
Group 0 15.8 14.4 16.5 25.7 37.7 26.9 28.7 21.7 5.8 15.2 13.7 26.5 19.2 20.5 2.5
Group 1 5.2 7.6 8.7 10.4 0.0 7.7 22.2 13.4 5.5 2.2 8.4 9.5 17.2 0.0 11.9
Group 2 10.6 9.2 6.4 14.5 2.7 11.1 11.8 3.5 0.4 8.0 13.9 20.7 0.0 0.0 16.5

What Result Might We Expect?

Figure 9 shows a plot of the time in open-ended sections for the mice in the three groups. Note that the Group 1 and Group 2 mice spent on average about the same amount of time in the open-ended sections but that Group 0 spent on average somewhat more time in the open-ended sections. This would tend to suggest that the null hypothesis that all three population means are equal should be rejected. Remember that to reject $H_0$, it is sufficient for just one of the population means to be different.

Figure 12.9: FIGURE 9 Evidence that the population mean of Group 0 is larger than the others.

Solution

We use the instructions provided in the Step-by-Step Technology Guide at the end of this section. We first verify whether the requirements are met.

  • The normal probability plots in Figure 10 indicate acceptable normality.
  • The group standard deviations are $s_0 \approx 9.03$, $s_1 \approx 6.03$, and $s_2 \approx 6.40$. Thus, the largest standard deviation is not greater than twice the smallest, which verifies the equal variances requirement.

    Figure 12.10: FIGURE 10 Normal probability plots.


  • The selection of a mouse to a particular group did not affect the selection of mice to the other groups, so that the samples are independent.

Thus, we proceed with the one-way ANOVA.

$$H_0: \mu_0 = \mu_1 = \mu_2 \quad \text{versus} \quad H_a: \text{not all the population means are equal}$$

where the $\mu_i$ represent the population mean time spent in the open-ended sections of the maze for each group.

Figure 11 contains the results from the TI-83/84, showing where each statistic corresponds to the ANOVA table structure in Table 3. We have $F_{\text{data}} \approx 10.906$, with a $p$-value of "1.5320224E-4" = 0.00015320224. This $p$-value is less than $\alpha$, so we reject $H_0$. There is evidence at level of significance $\alpha$ that the population mean times in the open-ended sections of the maze are not equal for all three groups.

Figure 12.11: FIGURE 11 Correspondence between TI-83/84 ANOVA output and the ANOVA table.

Figure 12 contains the Excel ANOVA results, Figure 13 contains the Minitab ANOVA results, and Figure 14 contains the JMP ANOVA results. Values differ slightly due to rounding.

Figure 12.12: FIGURE 12 Excel ANOVA results.
Figure 12.13: FIGURE 13 Minitab ANOVA results.
Figure 12.14: FIGURE 14 JMP results.
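For readers without access to the packages above, the same test can be sketched in plain Python from the raw data. This is an illustration under stated assumptions, not a replacement for the software output: the grouping of the Table 5 values into three samples of 15 is an assumption about the table layout, the function name `one_way_anova` is hypothetical, and the $p$-value line uses a closed form that is valid only because $df_1 = 2$ here.

```python
from statistics import mean, variance

# Times (seconds) from Table 5, assuming 15 mice per group.
groups = {
    "Group 0": [15.8, 14.4, 16.5, 25.7, 37.7, 26.9, 28.7, 21.7,
                5.8, 15.2, 13.7, 26.5, 19.2, 20.5, 2.5],
    "Group 1": [5.2, 7.6, 8.7, 10.4, 0.0, 7.7, 22.2, 13.4,
                5.5, 2.2, 8.4, 9.5, 17.2, 0.0, 11.9],
    "Group 2": [10.6, 9.2, 6.4, 14.5, 2.7, 11.1, 11.8, 3.5,
                0.4, 8.0, 13.9, 20.7, 0.0, 0.0, 16.5],
}


def one_way_anova(samples):
    """Return (F_data, df1, df2) for a list of samples."""
    k = len(samples)
    n_t = sum(len(s) for s in samples)
    grand_mean = sum(sum(s) for s in samples) / n_t
    sstr = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    sse = sum((len(s) - 1) * variance(s) for s in samples)
    df1, df2 = k - 1, n_t - k
    return (sstr / df1) / (sse / df2), df1, df2


f_data, df1, df2 = one_way_anova(list(groups.values()))

# Right-tail area of the F distribution; this closed form holds only for df1 == 2.
p_value = (1 + 2 * f_data / df2) ** (-df2 / 2)

print(f"F_data = {f_data:.3f} with df = ({df1}, {df2}), p-value = {p_value:.8f}")
```

The result should agree with the calculator output in Figure 11 up to rounding.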

One-way ANOVA may also be conducted using the critical-value method. The conditions are the same as for the $p$-value method.


EXAMPLE 6 Performing one-way ANOVA using the critical-value method


Use the data from Example 5 to test, using the critical-value method and level of significance $\alpha = 0.01$, whether the population mean time spent in the open-ended sections of the maze was the same for all three groups.

Solution

The conditions for performing ANOVA were verified in Example 5.

  • Step 1 State the hypotheses.

    $$H_0: \mu_0 = \mu_1 = \mu_2 \quad \text{versus} \quad H_a: \text{not all the population means are equal}$$

    where the $\mu_i$ represent the population mean time spent in the open-ended sections of the maze for each group.

  • Step 2 Find the critical value and state the rejection rule. The one-way ANOVA test is a right-tailed test, so the $F$-critical value $F_{\text{crit}}$ is the value of the $F$ distribution for $df_1 = k - 1$ and $df_2 = n_t - k$ that has area $\alpha$ to the right of it (see Figure 16). Here, $df_1 = 3 - 1 = 2$ and $df_2 = 45 - 3 = 42$. To find $F_{\text{crit}}$, we may use the $F$ tables or technology. To find our $F_{\text{crit}}$ using Excel, enter =FINV(0.01,2,42) in cell A1, as shown in Figure 15. Thus, $F_{\text{crit}} \approx 5.149$. ANOVA is a right-tailed test, so we will reject $H_0$ if $F_{\text{data}} \ge 5.149$.
    Figure 12.15: FIGURE 15 Using Excel to find the critical value.
  • Step 3 Calculate $F_{\text{data}}$. From Example 5, we have $F_{\text{data}} \approx 10.906$.
  • Step 4 State the conclusion and interpretation. Because $F_{\text{data}} \approx 10.906 \ge 5.149$ (Figure 16), we reject $H_0$. There is evidence that not all population mean times spent in the open-ended sections of the maze are equal.
    Figure 12.16: FIGURE 16 $F_{\text{crit}}$ has area of $\alpha$ to the right of it.
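The critical value can also be cross-checked in Python. As with the $p$-value, the sketch below is not what Excel's FINV does internally: it inverts a right-tail formula that holds only when the numerator degrees of freedom equal 2, giving $F_{\text{crit}} = (df_2/2)(\alpha^{-2/df_2} - 1)$. The function name `f_critical_df1_2` is hypothetical, and the test statistic 10.906 is a rounded value recomputed from the Table 5 data.

```python
def f_critical_df1_2(alpha, df2):
    """Value with right-tail area alpha under an F distribution with
    2 numerator and df2 denominator degrees of freedom.

    Valid ONLY when the numerator df equals 2.
    """
    return (df2 / 2) * (alpha ** (-2 / df2) - 1)


alpha = 0.01
f_crit = f_critical_df1_2(alpha, 42)

# Rounded test statistic for the maze data (an assumption of this sketch).
f_data = 10.906
reject = f_data >= f_crit

print(f"F_crit = {f_crit:.3f}, reject H0: {reject}")
```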

NOW YOU CAN DO

Exercises 29–30.

Developing Your Statistical Sense

Do Not Draw the Wrong Conclusion

Note that we did not conclude that all three population means are different. As long as one mean is sufficiently different from the other two, we would reject $H_0$. Our conclusion was simply that the population means were not all equal.

Also, we cannot yet formally conclude that Group 0 has a larger population mean time than the other groups, even though Figure 9 seems to indicate so. All we can formally conclude at this point is that not all the population means are equal. In Section 12.2, we will learn multiple comparisons, which is the type of analysis needed to test whether the mean of Group 0 is larger than the others.

Professors on Facebook

A recent study investigated whether the amount of information a professor posts about himself or herself (that is, self-disclosure) on the online social network Facebook is related to student motivation. A professor constructed three different Facebook sites, one offering low self-disclosure, one offering medium self-disclosure, and one offering high self-disclosure. For example, the low-disclosure site offered information only about her position at the university. The medium-disclosure site also showed the professor's favorite movies, books, and quotes. On the high-disclosure site, fictitious comments from “friends” were posted on “the Wall,” highlighting social gatherings.

Study participants (students not enrolled in the professor's courses) were then randomly assigned to access and browse one of the three Facebook sites, develop an impression of the professor, and complete the research questionnaire. Student motivation was measured using a set of 16 items, and the sum of the 16 items was calculated to form the total motivation score. The items measured student interest, involvement, stimulation, level of excitement, and whether the student was inspired or challenged. Use technology to test, at level of significance $\alpha$, whether the population mean motivation scores are equal for the three types of Facebook pages: low, medium, and high self-disclosure.

Solution

First, we verify whether the requirements are satisfied.

  • The normal probability plots in Figure 17 indicate acceptable normality.
  • The standard deviations are shown in blue in the Minitab output in Figure 18. The largest is not larger than twice the smallest. Thus, the equal-variance requirement is satisfied.
  • The student participants were randomly selected for each level of self-disclosure, so the independence assumption is also validated.
    Figure 12.17: FIGURE 17 Normal probability plots.
    Figure 12.18: FIGURE 18 Minitab output for Facebook ANOVA.


We therefore proceed with the ANOVA. The hypotheses are

$$H_0: \mu_{\text{low}} = \mu_{\text{medium}} = \mu_{\text{high}} \quad \text{versus} \quad H_a: \text{not all the population means are equal}$$

where each $\mu$ represents the population mean motivation score for the corresponding self-disclosure level. Reject $H_0$ if the $p$-value is less than $\alpha$.

From Figure 18, we obtain the test statistic $F_{\text{data}}$, with an associated $p$-value of approximately 0.001 (shown in blue). This $p$-value is less than $\alpha$, so we reject $H_0$. There is evidence that not all population mean motivation scores are equal across all levels of self-disclosure. Informally, we may observe that the mean motivation score for the Facebook Web site with low self-disclosure seems lower than the other groups. We test this formally in Section 12.2.