EXAMPLE 10 Extrapolation Is Risky!

Return to the data in Table 6.3 (page 245), which gives the times for an 8-year-old competitive swimmer’s 50-yard butterfly over 14 races (her time for race 10 was never recorded). A scatterplot of these data appears in Figure 6.3 (page 245). After removing the two circled outliers (corresponding to the times of the swimmer’s first two races), the form of the remaining data points appears to be a straight line. Figure 6.17 shows the result of using Excel (see Spotlight 6.5 on page 265) to make a scatterplot of the data and determine the equation of the least-squares regression line.

Figure 6.17 Fitting a least-squares regression line to data from a swimmer’s 50-yard butterfly races.

First, we’d like to predict the time for the 10th race, the race in which the swimmer’s time was never recorded. Substituting x = 10 into the regression equation from Figure 6.17 gives the predicted time.

Given the pattern in the surrounding data, a predicted time of 42.61 seconds seems reasonable. This is an example of interpolation: predicting a value of the response variable for an x-value within the range of the observed x-values.
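The definition above amounts to a simple range check, sketched here in Python. The function name `classify_prediction` is hypothetical, and the list of race numbers is an assumption: it supposes that the races in Table 6.3 are numbered 1 through 15, with race 10 missing and races 1 and 2 removed as outliers.

```python
def classify_prediction(x, observed_x):
    """Label a prediction at x by comparing x with the range of observed x-values."""
    lo, hi = min(observed_x), max(observed_x)
    return "interpolation" if lo <= x <= hi else "extrapolation"

# Assumed race numbers used for the fit: 3 through 15, skipping unrecorded race 10.
observed_races = [race for race in range(3, 16) if race != 10]

print(classify_prediction(10, observed_races))   # race 10 lies inside the observed range
print(classify_prediction(150, observed_races))  # race 150 lies far outside it
```

The check says nothing about whether a prediction is accurate; it only flags whether the data can offer any support for the trend at that x-value.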

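The swimmer really wanted to be able to predict what her time would be after many races, say, for race 150 (she figured she’d be about 16 years old by that time). Substituting x = 150 into the regression equation from Figure 6.17 gives a predicted time of -4.75 seconds.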

Finishing a race 4.75 seconds before the race begins is clearly impossible! This is an example of extrapolation: predicting a value of the response variable for an x-value that lies outside the range of the observed x-values. Just because the data fit a particular linear trend over a certain interval, there is no guarantee that the trend will continue into the future. So, avoid extrapolation, particularly for x-values far from the x-values in the data.
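To see how the straight line produces an impossible answer, the sketch below back-solves an approximate slope and intercept from the two predictions quoted in this example (42.61 seconds at race 10 and -4.75 seconds at race 150). These coefficients are an assumption recovered from rounded values. They are not the equation reported in Figure 6.17, which may differ slightly. The function names are hypothetical.

```python
# Predictions quoted in the text: (race number, predicted time in seconds).
x1, y1 = 10, 42.61
x2, y2 = 150, -4.75

# Both predictions come from the same line, so two points determine it
# (approximately, because the quoted times are rounded).
slope = (y2 - y1) / (x2 - x1)   # seconds per additional race
intercept = y1 - slope * x1     # predicted time at race 0

def predict(race):
    """Predicted time (seconds) from the back-solved line."""
    return intercept + slope * race

def zero_crossing():
    """Race number at which the line predicts a time of 0 seconds."""
    return -intercept / slope

print(f"slope = {slope:.4f} s/race, intercept = {intercept:.3f} s")
print(f"race 10:  {predict(10):.2f} s")
print(f"race 150: {predict(150):.2f} s")
print(f"predicted time reaches 0 near race {zero_crossing():.0f}")
```

Because the slope is negative, the line must eventually drop below zero. Under these assumed coefficients that happens well before race 150, so every prediction beyond that point is physically meaningless.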