Four sets of data prepared by the statistician Frank Anscombe illustrate the dangers of calculating without first plotting the data.11
Without making scatterplots, find the correlation and the least-squares regression line for each of the four data sets. What do you notice? Use each regression line to predict y for x = 10.
Make a scatterplot for each of the data sets, and add the regression line to each plot.
In which of the four cases would you be willing to use the regression line to describe the dependence of y on x? Explain your answer in each case.
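The calculations asked for above can be sketched in a few lines of Python. This is a minimal illustration, not part of the exercise: the function name `correlation_and_line` and the five demonstration points are hypothetical, and the demonstration points are not Anscombe's data. To work the exercise, substitute each of the four data sets in turn.

```python
from math import sqrt


def correlation_and_line(x, y):
    """Return (r, slope, intercept) for the least-squares line y = intercept + slope * x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, with at least 2 points")

    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Sums of squares and cross-products about the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when x or y does not vary")

    r = sxy / sqrt(sxx * syy)              # correlation
    slope = sxy / sxx                      # least-squares slope
    intercept = mean_y - slope * mean_x    # line passes through (mean_x, mean_y)
    return r, slope, intercept


# Hypothetical demonstration data (NOT one of Anscombe's data sets).
x_demo = [1, 2, 3, 4, 5]
y_demo = [2.1, 3.9, 6.2, 7.8, 10.1]

r, slope, intercept = correlation_and_line(x_demo, y_demo)
prediction = intercept + slope * 10  # predicted y at x = 10

print(f"r = {r:.3f}")
print(f"y-hat = {intercept:.3f} + {slope:.3f} x")
print(f"predicted y at x = 10: {prediction:.2f}")
```

The function reports only numerical summaries. Whether a straight line is a reasonable description of a data set is something the numbers cannot show, so the scatterplots are still needed to answer the last question.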