How It Works

11.1 CONDUCTING A ONE-WAY BETWEEN-GROUPS ANOVA

Irwin and colleagues (2004) are among a growing number of behavioral health researchers who are interested in adherence to medical regimens. These researchers studied adherence to an exercise regimen over one year in postmenopausal women, who are at increased risk for medical problems that may be reduced by exercise. Among the many factors that the research team examined was attendance at a monthly group education program that taught tactics to change exercise behavior; the researchers recorded attendance and divided participants into three categories based on the number of sessions they attended. (Note: The researchers could have kept the data as numbers of sessions, a scale variable, rather than dividing them into categories based on numbers of sessions, an ordinal variable.)

Here is an abbreviated version of this study with fictional data points; the means of these data points, however, are the actual means of the study.

< 5 sessions: 155, 120, 130

5–8 sessions: 199, 160, 184

9–12 sessions: 230, 214, 195, 209

In this study, the independent variable was attendance, with three levels: <5 sessions, 5–8 sessions, and 9–12 sessions. The dependent variable was number of minutes of exercise per week. So we have one ordinal independent variable with three between-groups levels and one scale dependent variable. How can we conduct a one-way between-groups ANOVA?

Summary of Step 1

Population 1: Postmenopausal women who attended fewer than 5 sessions of a group exercise-education program.

Population 2: Postmenopausal women who attended 5–8 sessions of a group exercise-education program.

Population 3: Postmenopausal women who attended 9–12 sessions of a group exercise-education program.

The comparison distribution will be an F distribution. The hypothesis test will be a one-way between-groups ANOVA. The data were not selected randomly, so we must generalize only with caution. We do not know if the underlying population distributions are normal, but the sample data do not indicate severe skew. To check the homoscedasticity assumption, we see whether the largest variance is no greater than twice the smallest variance. From the calculations below, the largest variance, 387, is not more than twice the smallest, 208.67 (2 × 208.67 = 417.34), so we have met the homoscedasticity assumption. (The following information is taken from the calculation of SSwithin.)

| Sample | < 5 | 5–8 | 9–12 |
|---|---|---|---|
| Squared deviations | 400 | 324 | 324 |
| | 225 | 441 | 4 |
| | 25 | 9 | 289 |
| | | | 9 |
| Sum of squares | 650 | 774 | 626 |
| N − 1 | 2 | 2 | 3 |
| Variance | 325 | 387 | 208.67 |
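The variance check above can be reproduced with a short Python sketch. It is only an illustration of the arithmetic, not part of the original study; the names groups, variances, and meets_assumption are our own, and the "no more than twice" rule is the rule of thumb stated in the text.

```python
from statistics import variance

# Fictional minutes-of-exercise data from the example (one list per attendance group).
groups = {
    "<5": [155, 120, 130],
    "5-8": [199, 160, 184],
    "9-12": [230, 214, 195, 209],
}

# Sample variance for each group: sum of squared deviations divided by (N - 1).
variances = {name: variance(scores) for name, scores in groups.items()}

largest = max(variances.values())
smallest = min(variances.values())

# Rule of thumb from the text: the largest variance is no more than twice the smallest.
meets_assumption = largest <= 2 * smallest

for name, var in variances.items():
    print(f"{name}: variance = {var:.2f}")
print(f"largest / smallest = {largest / smallest:.2f}; assumption met: {meets_assumption}")
```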

Summary of Step 2

Null hypothesis: Postmenopausal women in different categories of attendance at a group exercise-education program exercise the same average number of minutes per week. H0: μ1 = μ2 = μ3.

Research hypothesis: Postmenopausal women in different categories of attendance at a group exercise-education program do not exercise the same average number of minutes per week. H1: at least one μ is different from another μ.

Summary of Step 3

dfbetween = Ngroups − 1 = 3 − 1 = 2

df1 = 3 − 1 = 2; df2 = 3 − 1 = 2; df3 = 4 − 1 = 3

dfwithin = 2 + 2 + 3 = 7

The comparison distribution will be the F distribution with 2 and 7 degrees of freedom.
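The degrees of freedom follow directly from the group sizes. A minimal sketch, assuming only the three sample sizes from the example (the variable names are ours):

```python
# Number of participants in each attendance group: <5, 5-8, and 9-12 sessions.
group_sizes = [3, 3, 4]

# Between-groups degrees of freedom: number of groups minus 1.
df_between = len(group_sizes) - 1

# Within-groups degrees of freedom: sum of (N - 1) across groups.
df_within = sum(n - 1 for n in group_sizes)

# Total degrees of freedom: total number of participants minus 1.
df_total = sum(group_sizes) - 1

print(df_between, df_within, df_total)
```

The total should always equal the sum of the between-groups and within-groups degrees of freedom, which gives a quick check on the arithmetic.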

Summary of Step 4

The critical F statistic based on a p level of 0.05 is 4.74.
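The cutoff is normally read from an F table. As a cross-check, the sketch below uses a closed-form expression that holds only when the between-groups degrees of freedom equal 2, as they do here; in that special case the probability of exceeding a value f is (1 + 2f/dfwithin) raised to the power −dfwithin/2. The function name is hypothetical, and for any other numerator degrees of freedom a table or statistical software would be needed.

```python
def f_critical_df_between_2(df_within: int, alpha: float = 0.05) -> float:
    """Critical F value, valid ONLY when the between-groups df is exactly 2."""
    # With 2 numerator degrees of freedom:
    #   P(F > f) = (1 + 2 * f / df_within) ** (-df_within / 2)
    # Setting this equal to alpha and solving for f gives the cutoff.
    return (df_within / 2) * (alpha ** (-2 / df_within) - 1)


critical_f = f_critical_df_between_2(df_within=7, alpha=0.05)
print(round(critical_f, 2))
```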

Summary of Step 5

dftotal = 2 + 7 = 9 or dftotal = 10 − 1 = 9

SStotal = Σ(X − GM)² = 12,222.40

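| Sample | X | (X − GM) | (X − GM)² |
|---|---|---|---|
| < 5 | 155 | −24.6 | 605.16 |
| M<5 = 135 | 120 | −59.6 | 3552.16 |
| | 130 | −49.6 | 2460.16 |
| 5–8 | 199 | 19.4 | 376.36 |
| M5–8 = 181 | 160 | −19.6 | 384.16 |
| | 184 | 4.4 | 19.36 |
| 9–12 | 230 | 50.4 | 2540.16 |
| M9–12 = 212 | 214 | 34.4 | 1183.36 |
| | 195 | 15.4 | 237.16 |
| | 209 | 29.4 | 864.36 |
| GM = 179.60 | | | SStotal = 12,222.40 |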
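The total sum of squares ignores group membership: every score is compared with the grand mean. A minimal sketch of that calculation, with variable names of our own choosing:

```python
# All ten fictional scores, pooled across the three attendance groups.
scores = [155, 120, 130, 199, 160, 184, 230, 214, 195, 209]

# Grand mean (GM): the mean of every score in the study.
grand_mean = sum(scores) / len(scores)

# SStotal: sum of squared deviations of each score from the grand mean.
ss_total = sum((x - grand_mean) ** 2 for x in scores)

print(f"GM = {grand_mean:.2f}, SStotal = {ss_total:.2f}")
```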

SSwithin = Σ(X − M)² = 2050.00

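| Sample | X | (X − M) | (X − M)² |
|---|---|---|---|
| < 5 | 155 | 20 | 400 |
| M<5 = 135 | 120 | −15 | 225 |
| | 130 | −5 | 25 |
| 5–8 | 199 | 18 | 324 |
| M5–8 = 181 | 160 | −21 | 441 |
| | 184 | 3 | 9 |
| 9–12 | 230 | 18 | 324 |
| M9–12 = 212 | 214 | 2 | 4 |
| | 195 | −17 | 289 |
| | 209 | −3 | 9 |
| GM = 179.60 | | | SSwithin = 2050.00 |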
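For the within-groups sum of squares, each score is compared with the mean of its own group rather than with the grand mean. A brief sketch under the same assumptions as before (fictional data, our own variable names):

```python
# Fictional scores, keyed by attendance group.
groups = {
    "<5": [155, 120, 130],
    "5-8": [199, 160, 184],
    "9-12": [230, 214, 195, 209],
}

ss_within = 0.0
for name, scores in groups.items():
    # Mean of this group only.
    group_mean = sum(scores) / len(scores)
    # Add this group's squared deviations from its own mean.
    ss_within += sum((x - group_mean) ** 2 for x in scores)

print(f"SSwithin = {ss_within:.2f}")
```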

SSbetween = Σ(M − GM)² = 10,172.40

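| Sample | X | (M − GM) | (M − GM)² |
|---|---|---|---|
| < 5 | 155 | −44.6 | 1989.16 |
| M<5 = 135 | 120 | −44.6 | 1989.16 |
| | 130 | −44.6 | 1989.16 |
| 5–8 | 199 | 1.4 | 1.96 |
| M5–8 = 181 | 160 | 1.4 | 1.96 |
| | 184 | 1.4 | 1.96 |
| 9–12 | 230 | 32.4 | 1049.76 |
| M9–12 = 212 | 214 | 32.4 | 1049.76 |
| | 195 | 32.4 | 1049.76 |
| | 209 | 32.4 | 1049.76 |
| GM = 179.60 | | | SSbetween = 10,172.40 |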
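In the table, the squared deviation of a group mean from the grand mean is repeated once for every participant in that group. The sketch below takes the equivalent shortcut of multiplying each squared deviation by the group size; the variable names are our own.

```python
# Fictional scores, keyed by attendance group.
groups = {
    "<5": [155, 120, 130],
    "5-8": [199, 160, 184],
    "9-12": [230, 214, 195, 209],
}

# Grand mean across every participant.
all_scores = [x for scores in groups.values() for x in scores]
grand_mean = sum(all_scores) / len(all_scores)

ss_between = 0.0
for name, scores in groups.items():
    group_mean = sum(scores) / len(scores)
    # One squared deviation per participant, so weight by the group size.
    ss_between += len(scores) * (group_mean - grand_mean) ** 2

print(f"SSbetween = {ss_between:.2f}")
```

Because SStotal = SSwithin + SSbetween, adding this result to SSwithin (2050.00) should reproduce SStotal (12,222.40).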
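MSbetween = SSbetween / dfbetween = 10,172.40 / 2 = 5,086.20

MSwithin = SSwithin / dfwithin = 2,050.00 / 7 = 292.857

F = MSbetween / MSwithin = 5,086.20 / 292.857 = 17.37

| Source | SS | df | MS | F |
|---|---|---|---|---|
| Between | 10,172.40 | 2 | 5,086.200 | 17.37 |
| Within | 2,050.00 | 7 | 292.857 | |
| Total | 12,222.40 | 9 | | |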
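The entries in the source table can be rebuilt from the sums of squares and degrees of freedom. A minimal sketch, assuming the values already calculated in Steps 3 and 5 (variable names are ours):

```python
# Sums of squares and degrees of freedom from the earlier steps.
ss_between, ss_within = 10172.40, 2050.00
df_between, df_within = 2, 7

# Each mean square is a sum of squares divided by its degrees of freedom.
ms_between = ss_between / df_between
ms_within = ss_within / df_within

# The F statistic is the ratio of between-groups to within-groups variance.
f_stat = ms_between / ms_within

print(f"MSbetween = {ms_between:.3f}")
print(f"MSwithin  = {ms_within:.3f}")
print(f"F         = {f_stat:.2f}")
```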

Summary of Step 6

The F statistic, 17.37, is beyond the cutoff of 4.74, so we can reject the null hypothesis. It appears that postmenopausal women in different categories of attendance at a group exercise-education program do exercise a different average number of minutes per week. However, the results from this ANOVA do not tell us where specific differences lie. The ANOVA tells us only that there is at least one difference between means. We must conduct a post hoc test to determine exactly which pairs of means are different.