15.26 Always plot your data! Table 15.2 presents four sets of data prepared by the statistician Frank Anscombe to illustrate the dangers of calculating without first plotting the data. All four sets have the same correlation and the same least-squares regression line to several decimal places. The regression equation is $\hat{y} = 3 + 0.5x$.
(a) Make a scatterplot for each of the four data sets and draw the regression line on each of the plots. (To draw the regression line, substitute two values of $x$, for example $x = 5$ and $x = 10$, into the equation. Find the predicted $y$ for each $x$. Plot these two points and draw the line through them on all four plots.)
(b) In which of the four cases would you be willing to use the regression line to predict $y$ from a given value of $x$? Explain your answer in each case.
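As a worked illustration of the hint in part (a), the two points needed to draw the line can be computed directly. This sketch assumes the fitted line for Anscombe's data is $\hat{y} = 3 + 0.5x$ and that $x = 5$ and $x = 10$ are the chosen values; any two distinct values of $x$ would determine the same line.

```latex
\begin{aligned}
\hat{y}(5)  &= 3 + 0.5(5)  = 5.5 \\
\hat{y}(10) &= 3 + 0.5(10) = 8
\end{aligned}
```

Under these assumptions, the line passes through $(5, 5.5)$ and $(10, 8)$ on every one of the four plots.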
Table 15.2

Data Set A
x | 10 | 8 | 13 | 9 | 11 | 14 | 6 | 4 | 12 | 7 | 5
y | 8.04 | 6.95 | 7.58 | 8.81 | 8.33 | 9.96 | 7.24 | 4.26 | 10.84 | 4.82 | 5.68

Data Set B
x | 10 | 8 | 13 | 9 | 11 | 14 | 6 | 4 | 12 | 7 | 5
y | 9.14 | 8.14 | 8.74 | 8.77 | 9.26 | 8.10 | 6.13 | 3.10 | 9.13 | 7.26 | 4.74

Data Set C
x | 10 | 8 | 13 | 9 | 11 | 14 | 6 | 4 | 12 | 7 | 5
y | 7.46 | 6.77 | 12.74 | 7.11 | 7.81 | 8.84 | 6.08 | 5.39 | 8.15 | 6.42 | 5.73

Data Set D
x | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 19
y | 6.58 | 5.76 | 7.71 | 8.84 | 8.47 | 7.04 | 5.25 | 5.56 | 7.91 | 6.89 | 12.50

Source: Frank J. Anscombe, “Graphs in statistical analysis,” The American Statistician, 27 (1973), pp. 17–21.
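The claim that all four sets share the same correlation and least-squares line can be checked numerically. The following is a minimal sketch in plain Python, using the values from the table above; the names `DATA`, `summarize`, and `RESULTS` are illustrative choices, not part of the exercise. It uses the standard formulas: slope = Sxy / Sxx, intercept = mean(y) − slope · mean(x), and r = Sxy / sqrt(Sxx · Syy).

```python
from math import sqrt

# x values shared by Data Sets A, B, and C (copied from the table).
X_ABC = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]

# Each entry maps a data set name to its (x, y) lists.
DATA = {
    "A": (X_ABC, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96,
                  7.24, 4.26, 10.84, 4.82, 5.68]),
    "B": (X_ABC, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10,
                  6.13, 3.10, 9.13, 7.26, 4.74]),
    "C": (X_ABC, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84,
                  6.08, 5.39, 8.15, 6.42, 5.73]),
    "D": ([8] * 10 + [19], [6.58, 5.76, 7.71, 8.84, 8.47, 7.04,
                            5.25, 5.56, 7.91, 6.89, 12.50]),
}


def summarize(x, y):
    """Return (r, intercept, slope) for the least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squared deviations and of cross-products.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return r, intercept, slope


RESULTS = {name: summarize(x, y) for name, (x, y) in DATA.items()}

for name, (r, intercept, slope) in RESULTS.items():
    print(f"Set {name}: r = {r:.3f}, y-hat = {intercept:.2f} + {slope:.3f}x")
```

The four summaries agree closely, yet they say nothing about whether a straight line is a sensible description of each data set; only the scatterplots requested in part (a) reveal that.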