• You can examine the fit of a regression line by plotting the residuals, which are the differences between the observed and predicted values of y. Be on the lookout for points with unusually large residuals and also for nonlinear patterns and uneven variation about the line.
• Also look for influential observations, individual points whose removal would substantially change the regression line. Influential observations are often outliers in the x direction, but they need not have large residuals.
• Correlation and regression must be interpreted with caution. Plot the data to be sure that the relationship is roughly linear and to detect outliers and influential observations.
• Lurking variables may explain the relationship between the explanatory and response variables. Correlation and regression can be misleading if you ignore important lurking variables.
• We cannot conclude that there is a cause-and-effect relationship between two variables just because they are strongly associated. High correlation does not imply causation.
• A correlation based on averages is usually higher than the correlation computed from the data for individuals, because averaging removes much of the variation among individuals.
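Several of the points above can be checked numerically. The following is a minimal sketch in plain Python, not a definitive implementation: the helper names (`fit_line`, `residuals`, `correlation`) and all of the data are made up for illustration. It computes residuals from a least-squares line, shows how a single outlier in the x direction can change the slope without having the largest residual, and compares a correlation computed from individuals with one computed from group averages.

```python
from statistics import mean


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def residuals(x, y):
    """Observed y minus predicted y for each point."""
    a, b = fit_line(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


def correlation(x, y):
    """Pearson correlation coefficient r."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / (sxx * syy) ** 0.5


# --- Influential observation (hypothetical data) ---
# Ten points lying exactly on y = 2x, plus one outlier in the x direction.
x = list(range(1, 11))
y = [2 * xi for xi in x]
_, slope_without = fit_line(x, y)

x_all = x + [30]
y_all = y + [10]
_, slope_with = fit_line(x_all, y_all)
res_all = residuals(x_all, y_all)

print("slope without the outlier:", slope_without)
print("slope with the outlier:   ", round(slope_with, 3))
print("residual of the outlier:  ", round(res_all[-1], 3))
print("largest |residual| among the other points:",
      round(max(abs(r) for r in res_all[:-1]), 3))

# --- Correlation of averages vs. individuals (hypothetical data) ---
# Five groups centered at (g, g); within each group the individual
# deviations in x and y are uncorrelated with each other.
offsets = [(-2, 2), (2, -2), (-2, -2), (2, 2)]
groups = list(range(1, 6))

ind_x = [g + dx for g in groups for dx, _ in offsets]
ind_y = [g + dy for g in groups for _, dy in offsets]
r_individuals = correlation(ind_x, ind_y)

avg_x = [mean([g + dx for dx, _ in offsets]) for g in groups]
avg_y = [mean([g + dy for _, dy in offsets]) for g in groups]
r_means = correlation(avg_x, avg_y)

print("r for individuals:  ", round(r_individuals, 3))
print("r for group averages:", round(r_means, 3))
```

In this made-up data set the added point pulls the fitted line toward itself, so its own residual is smaller than the residuals of some ordinary points. That is why a residual plot alone may not reveal an influential observation, and why plotting the data themselves is still needed.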