2.138 Simpson’s paradox and regression
Simpson’s paradox occurs when a relationship between variables that holds within each group of observations reverses when the groups are combined. The phenomenon is usually discussed in terms of categorical variables, but it also occurs in other settings, including regression with a quantitative explanatory variable. Here is an example:
y | x | Group | y | x | Group |
---|---|---|---|---|---|
10.1 | 1 | 1 | 18.3 | 6 | 2 |
8.9 | 2 | 1 | 17.1 | 7 | 2 |
8.0 | 3 | 1 | 16.2 | 8 | 2 |
6.9 | 4 | 1 | 15.1 | 9 | 2 |
6.1 | 5 | 1 | 14.3 | 10 | 2 |
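
One way to see the reversal in these data is to fit a least-squares line to each group separately and then to all ten observations together. The following is a minimal sketch in plain Python, not a prescribed method. The helper `least_squares` and the variable names are illustrative assumptions, and the data values are copied from the table above.

```python
# Data copied from the table above, stored as (x, y) pairs for each group.
group1 = [(1, 10.1), (2, 8.9), (3, 8.0), (4, 6.9), (5, 6.1)]
group2 = [(6, 18.3), (7, 17.1), (8, 16.2), (9, 15.1), (10, 14.3)]


def least_squares(points):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x.

    Hypothetical helper written for this illustration.
    """
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    # Sums of cross-products and squares about the means.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    s_xx = sum((x - x_bar) ** 2 for x, _ in points)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return slope, intercept


fits = {
    "Group 1": least_squares(group1),
    "Group 2": least_squares(group2),
    "Combined": least_squares(group1 + group2),
}

for label, (slope, intercept) in fits.items():
    print(f"{label}: slope = {slope:.2f}, intercept = {intercept:.2f}")
```

Run as written, the sketch reports a negative slope within each group and a positive slope for the combined data, which is the sign reversal that defines Simpson’s paradox in a regression setting.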