SECTION 2.4 Summary
- Correlation and regression must be interpreted with caution. Plot the data to be sure the relationship is roughly linear and to detect outliers and influential observations.
- Avoid extrapolation, that is, using a regression line to predict the response for values of the explanatory variable far outside the range of the data from which the line was calculated.
- Remember that correlations based on averages are usually too high when applied to individual cases.
- Lurking variables that you did not measure may explain the relations between the variables you did measure. Correlation and regression can be misleading if you ignore important lurking variables.
- Most of all, be careful not to conclude that there is a cause-and-effect relationship between two variables just because they are strongly associated. High correlation does not imply causation. The best evidence that an association is due to causation comes from an experiment in which the explanatory variable is directly changed and other influences on the response are controlled.
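
The point about correlations based on averages can be illustrated with a small simulation. The following is a minimal sketch on made-up data, not an analysis from this section: the function names, the number of groups, the slope of 2, and the noise levels are all assumptions chosen for illustration. It computes the correlation once from individual observations and once from group averages of the same observations.

```python
import math
import random
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def individual_vs_averaged(n_groups=20, per_group=50, noise_sd=5.0, seed=0):
    """Return (r for individuals, r for group averages) on simulated data.

    Hypothetical setup: each group has its own center on the x axis, and
    y = 2x plus substantial individual-level noise.
    """
    rng = random.Random(seed)
    xs, ys, x_means, y_means = [], [], [], []
    for g in range(n_groups):
        center = 10.0 * g / (n_groups - 1)  # group centers spread over 0..10
        gx = [center + rng.gauss(0.0, 1.0) for _ in range(per_group)]
        gy = [2.0 * x + rng.gauss(0.0, noise_sd) for x in gx]
        xs.extend(gx)
        ys.extend(gy)
        # Averaging within a group cancels much of the individual noise.
        x_means.append(mean(gx))
        y_means.append(mean(gy))
    return pearson_r(xs, ys), pearson_r(x_means, y_means)


if __name__ == "__main__":
    r_individual, r_averaged = individual_vs_averaged()
    print(f"r from individual cases: {r_individual:.3f}")
    print(f"r from group averages:   {r_averaged:.3f}")
```

Under these assumptions the correlation computed from group averages comes out noticeably closer to 1 than the correlation computed from individual cases, because averaging removes most of the scatter among individuals. A line fitted to the averages would therefore overstate how well an individual's response can be predicted.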