12.4 Review of Concepts
Using the F Distribution with Three or More Samples
We use the F statistic when we want to compare more than two means. As with the z and t statistics, the F statistic is calculated by dividing a measure of the differences among sample means (between-groups variance) by a measure of variability within the samples (within-groups variance). The hypothesis test based on the F statistic is called analysis of variance (ANOVA).
ANOVA offers a solution to the problem of having to run multiple t tests, because it allows for multiple comparisons in just one statistical analysis. There are several different types of ANOVA, and each has two descriptors. One indicates the number of independent variables, such as one-way ANOVA for one independent variable. The other indicates whether participants are in only one condition (between-groups ANOVA) or in every condition (within-groups ANOVA). The major assumptions for ANOVA are random selection of participants, normally distributed underlying populations, and homoscedasticity, which means that all populations have the same variance (versus heteroscedasticity, which means that the populations do not all have the same variance). As with previous statistical tests, most real-life analyses do not meet all of these assumptions.
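The homoscedasticity assumption can be checked informally before running an ANOVA. The sketch below, in Python with made-up data, uses one common rule of thumb (an assumption here, not the only possible criterion): the largest sample variance should be no more than about twice the smallest.

```python
# Informal homoscedasticity check: compare the largest and smallest
# sample variances. The 2:1 cutoff is a common rule of thumb, used
# here as an illustrative assumption.

def sample_variance(scores):
    """Unbiased sample variance (divides by N - 1)."""
    n = len(scores)
    mean = sum(scores) / n
    return sum((x - mean) ** 2 for x in scores) / (n - 1)

def roughly_homoscedastic(groups, ratio_cutoff=2.0):
    variances = [sample_variance(g) for g in groups]
    return max(variances) <= ratio_cutoff * min(variances)

groups = [[4, 5, 6, 5], [7, 8, 6, 7], [5, 6, 7, 6]]
print(roughly_homoscedastic(groups))   # → True
```

If the check fails, the populations may be heteroscedastic and the ANOVA results should be interpreted with caution.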
One-Way Between-Groups ANOVA
The one-way between-groups ANOVA uses the six steps of hypothesis testing that we have already learned, but with some modifications, particularly to steps 3 and 5. Step 3 is simpler than with t tests; we only have to state that the comparison distribution is an F distribution and provide the degrees of freedom. In step 5, we calculate the F statistic; a source table helps us to keep track of the calculations. The F statistic is a ratio of two different estimates of population variance, both based on distributions of scores rather than distributions of means. The denominator, within-groups variance, is similar to the pooled variance of the independent-samples t test; it’s basically a weighted average of the variance within each sample. The numerator, between-groups variance, is an estimate based on the difference between the sample means, but it is then inflated to represent a distribution of scores rather than a distribution of means. As part of the calculations of between-groups variance and within-groups variance, we need to calculate a grand mean, the mean score of every participant in the study.
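The step-5 source-table calculations can be sketched in Python. The three small samples below are made-up illustration data, and the variable names are our own; the logic follows the grand mean, between-groups, and within-groups computations just described.

```python
# One-way between-groups ANOVA source-table calculations (a sketch
# with hypothetical data for three samples).

groups = [
    [4.0, 5.0, 6.0],   # sample 1
    [7.0, 8.0, 9.0],   # sample 2
    [4.0, 5.0, 6.0],   # sample 3
]

all_scores = [x for g in groups for x in g]
grand_mean = sum(all_scores) / len(all_scores)   # mean of every score in the study

# Between-groups sum of squares: deviations of each sample mean from the grand mean
ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)

# Within-groups sum of squares: deviations of each score from its own sample mean
ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)

df_between = len(groups) - 1                 # number of groups minus 1
df_within = len(all_scores) - len(groups)    # total scores minus number of groups

ms_between = ss_between / df_between         # between-groups variance
ms_within = ss_within / df_within            # within-groups variance
f_stat = ms_between / ms_within

print(round(f_stat, 2))   # → 9.0
```

Note that the within-groups variance behaves like a weighted average: samples with more scores contribute more deviations to ss_within and more degrees of freedom to df_within.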
A large between-groups variance and a small within-groups variance indicate a small degree of overlap among samples and likely a small degree of overlap among populations. A large between-groups variance divided by a small within-groups variance produces a large F statistic. If the F statistic is beyond a prescribed cutoff, or critical value, then we can reject the null hypothesis.
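The final decision step can be sketched as a simple comparison. The cutoff below is the tabled critical value for 2 and 6 degrees of freedom at a .05 alpha level (an assumption drawn from a standard F table; different degrees of freedom or alpha would require a new lookup), and the F statistic continues the hypothetical example above.

```python
# Decision rule: compare the obtained F statistic to the critical value.

f_stat = 9.00       # obtained F (illustrative value)
f_critical = 5.14   # tabled F cutoff for df = (2, 6) at alpha = .05

if f_stat > f_critical:
    print("Reject the null hypothesis: at least two means differ.")
else:
    print("Fail to reject the null hypothesis.")
```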
Beyond Hypothesis Testing for the One-Way Between-Groups ANOVA
As with other hypothesis tests, it is recommended that we calculate an effect size, usually R², when we conduct an ANOVA. In addition, when we reject the null hypothesis in an ANOVA, we only know that at least one of the means is different from at least one other mean. But we do not know exactly where the differences lie until we conduct a post hoc test such as the Tukey HSD test.
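R² is the proportion of total variability in the scores that is accounted for by the between-groups differences, computed from the same sums of squares that appear in the source table. The values below continue the hypothetical example.

```python
# Effect size for ANOVA: R squared = SS_between / SS_total,
# using the illustrative sums of squares from the earlier sketch.

ss_between = 18.0
ss_within = 6.0
r_squared = ss_between / (ss_between + ss_within)

print(round(r_squared, 2))   # → 0.75
```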
The Bonferroni test is a more conservative post hoc test and is helpful to researchers who want to explore a data set while minimizing the probability of making a Type I error.
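The arithmetic behind the Bonferroni test is simple: divide the overall alpha level by the number of pairwise comparisons, and require each individual comparison to meet that stricter cutoff. A minimal sketch:

```python
# Bonferroni correction: split the overall alpha across all pairwise
# comparisons so the overall chance of a Type I error stays low.

from math import comb

def bonferroni_alpha(overall_alpha, num_groups):
    num_comparisons = comb(num_groups, 2)   # number of pairwise comparisons
    return overall_alpha / num_comparisons

# With three groups there are 3 pairwise comparisons,
# so each test must meet alpha = .05 / 3.
print(round(bonferroni_alpha(0.05, 3), 4))   # → 0.0167
```

This is why the test is conservative: with many groups, the per-comparison alpha becomes very small, making it harder to reject the null hypothesis for any single pair.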