Question 6.94

59. In issue 49 of Stats: The Magazine for Students of Statistics, Schuyler Huck presents a dataset of 100 ordered pairs in which 25 of them are (17, 1), 25 are (18, 2), 25 are (19, 3), and 25 are (20, 4).

  1. Without doing much formal calculation, find the value of r and the slope of the least-squares regression line.
  2. Now, suppose someone adds the 101st point to the dataset: the ordered pair (1, 20). Predict the new value of r and the slope of the regression line, and then do a calculation to see how close your answer is.

59.

(a) The 100 points lie on a line with slope $1$, since every point satisfies $y = x - 16$. Because the points lie exactly on a line with positive slope, $r = 1$.
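One way to see this without formal calculation is the short derivation below. The symbols $S_{xx}$, $S_{yy}$, and $S_{xy}$ are shorthand introduced here for the sums of squared deviations and cross-products; they are an assumption of this sketch, not necessarily the book's notation.

```latex
\begin{align*}
y_i = x_i - 16 \quad &\Rightarrow \quad \bar{y} = \bar{x} - 16
  \quad \Rightarrow \quad y_i - \bar{y} = x_i - \bar{x} \\
S_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y})
  &= \sum (x_i - \bar{x})^2 = S_{xx} = S_{yy} \\
r = \frac{S_{xy}}{\sqrt{S_{xx}\,S_{yy}}} &= 1,
  \qquad \text{slope} = \frac{S_{xy}}{S_{xx}} = 1
\end{align*}
```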

(b) Here is a scatterplot of the data with the additional data point. (Note: Each dot in the lower right represents 25 data points.) Guesses for the slope and correlation will vary. However, the outlier will pull the line toward it, possibly turning the slope from positive to negative. If that happens, then the correlation will also be negative.

[Figure: scatterplot of the 101 data points, with the added point (1, 20) in the upper left]

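To find the value of $r$, we first find $s_x$ and $s_y$. (See Chapter 5, page 204, for the standard deviation formula.) First, we will need $\bar{x}$ and $\bar{y}$:

$$\bar{x} = \frac{25(17) + 25(18) + 25(19) + 25(20) + 1}{101} = \frac{1851}{101} \approx 18.327, \qquad \bar{y} = \frac{25(1) + 25(2) + 25(3) + 25(4) + 20}{101} = \frac{270}{101} \approx 2.673$$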

Next, we find the squared deviations from the mean and then the standard deviations.

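| Times observed | Observations $x$ | Deviations $x - \bar{x}$ | Squared deviations $(x - \bar{x})^2$ |
|---|---|---|---|
| 25 | 17 | -1.327 | 1.760 |
| 25 | 18 | -0.327 | 0.107 |
| 25 | 19 | 0.673 | 0.453 |
| 25 | 20 | 1.673 | 2.800 |
| 1 | 1 | -17.327 | 300.216 |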


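Weighting each squared deviation by the number of times it is observed gives a sum of about $428.218$, so $s_x = \sqrt{428.218/100} \approx 2.069$ (dividing by $n - 1 = 100$). Repeating the process for $y$, we find that $s_y \approx 2.069$.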
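The table-based calculation can be checked with a few lines of Python. This is a minimal sketch that assumes the sample standard deviation with the $n - 1$ divisor; the helper name `weighted_sd` and the `rows` layout are hypothetical, chosen only for this illustration.

```python
from math import sqrt

# (count, x, y): the four clusters of 25 points plus the added point (1, 20)
rows = [(25, 17, 1), (25, 18, 2), (25, 19, 3), (25, 20, 4), (1, 1, 20)]


def weighted_sd(pairs):
    """Return (mean, sum of squared deviations, sample SD) for (count, value) pairs.

    Assumes the n - 1 divisor for the standard deviation.
    """
    n = sum(count for count, _ in pairs)
    mean = sum(count * value for count, value in pairs) / n
    ss = sum(count * (value - mean) ** 2 for count, value in pairs)
    return mean, ss, sqrt(ss / (n - 1))


x_mean, x_ss, s_x = weighted_sd([(c, x) for c, x, _ in rows])
y_mean, y_ss, s_y = weighted_sd([(c, y) for c, _, y in rows])

print(f"x: mean={x_mean:.3f}, sum of squares={x_ss:.3f}, s_x={s_x:.3f}")
print(f"y: mean={y_mean:.3f}, sum of squares={y_ss:.3f}, s_y={s_y:.3f}")
```

The two standard deviations agree because the added point lies 17.5 units below the original mean of $x$ and 17.5 units above the original mean of $y$, so it distorts both variables by the same amount.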

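Since $r = \dfrac{1}{n-1}\sum \left(\dfrac{x - \bar{x}}{s_x}\right)\left(\dfrac{y - \bar{y}}{s_y}\right)$ and $\sum (x - \bar{x})(y - \bar{y}) \approx -178.218$, we have the following:

$$r \approx \frac{-178.218}{100\,(2.069)(2.069)} \approx -0.416$$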

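Next, we determine the slope using the formula $\text{slope} = r \cdot \dfrac{s_y}{s_x}$.

However, since we know $s_x = s_y$, the ratio $s_y / s_x$ equals $1$.

Thus, the slope equals $r$, which is approximately $-0.416$. The single added point changes both the correlation and the slope from $1$ to about $-0.416$.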
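As a cross-check on the whole solution, the sketch below computes $r$ and the least-squares slope directly from the raw points, before and after adding (1, 20). It uses the sums-of-deviations form of the formulas rather than the standard-deviation route taken above; the function name `r_and_slope` is hypothetical.

```python
from math import sqrt


def r_and_slope(points):
    """Return (correlation r, least-squares slope) for a list of (x, y) pairs."""
    n = len(points)
    x_mean = sum(x for x, _ in points) / n
    y_mean = sum(y for _, y in points) / n
    sxx = sum((x - x_mean) ** 2 for x, _ in points)
    syy = sum((y - y_mean) ** 2 for _, y in points)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in points)
    return sxy / sqrt(sxx * syy), sxy / sxx


# 25 copies each of (17, 1), (18, 2), (19, 3), (20, 4)
base = [(17 + k, 1 + k) for k in range(4) for _ in range(25)]
with_outlier = base + [(1, 20)]

r_base, slope_base = r_and_slope(base)
r_new, slope_new = r_and_slope(with_outlier)

print(f"100 points: r = {r_base:.3f}, slope = {slope_base:.3f}")
print(f"101 points: r = {r_new:.3f}, slope = {slope_new:.3f}")
```

Under these assumptions the original 100 points give $r = 1$ and slope $1$, and the 101 points give $r$ and slope both equal to $-72/173$, about $-0.416$.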