Least-squares regression fits a straight line to data in order to predict a response variable y from an explanatory variable x. Inference about regression requires additional conditions.
The simple linear regression model says that there is a population regression line μy = β0 + β1x that describes how the mean response in an entire population varies as x changes. The observed response y for any x has a Normal distribution with mean given by the population regression line and with the same standard deviation σ for any value of x.
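A small simulation can make the model concrete: responses scatter Normally about the population line with the same σ at every x. The parameter values here (β0 = 2, β1 = 0.5, σ = 1) are invented for illustration, not taken from the text.

```python
import numpy as np

# Illustrative parameter values (assumptions, not from the text)
beta0, beta1, sigma = 2.0, 0.5, 1.0

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
mu_y = beta0 + beta1 * x                       # population regression line
y = mu_y + rng.normal(0, sigma, size=x.size)   # same sigma at every x
```

Fitting a least-squares line to such simulated data recovers a slope close to the true β1, which is exactly the sense in which b1 estimates β1.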
The parameters of the simple linear regression model are the intercept β0, the slope β1, and the model standard deviation σ. The intercept b0 and slope b1 of the least-squares line estimate the intercept β0 and slope β1 of the population regression line.
The parameter σ is estimated by the regression standard error

\[ s = \sqrt{\frac{1}{n-2}\sum (y_i - \hat{y}_i)^2} \]

where the differences between the observed and predicted responses are the residuals

\[ e_i = y_i - \hat{y}_i \]
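The residuals and the regression standard error can be computed directly from a least-squares fit. The data below are made up for illustration.

```python
import numpy as np

# Made-up data for illustration
x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.9, 8.2, 8.8])

n = x.size
b1, b0 = np.polyfit(x, y, 1)          # least-squares slope and intercept
y_hat = b0 + b1 * x                   # predicted responses
e = y - y_hat                         # residuals e_i = y_i - y_hat_i
s = np.sqrt(np.sum(e**2) / (n - 2))   # regression standard error, n-2 df
```

Note the divisor n − 2 rather than n − 1: fitting the line uses up two degrees of freedom, one each for the slope and the intercept.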
Prior to inference, always examine the residuals for Normality, constant variance, and any other remaining patterns in the data. Plots of the residuals are commonly used as part of this examination.
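Plots are the standard tools, but simple numeric screens can supplement them. The sketch below uses a Shapiro-Wilk test for Normality and compares residual spread across the two halves of x; both are our own choices of check, not methods prescribed by the text, and the data are made up.

```python
import numpy as np
from scipy import stats

# Made-up data for illustration
x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.9, 8.2, 8.8])

b1, b0 = np.polyfit(x, y, 1)
e = y - (b0 + b1 * x)                 # residuals

# Normality screen: a small p-value casts doubt on the Normal condition
stat, p_normal = stats.shapiro(e)

# Constant-variance screen: residual spread in low-x vs. high-x halves
spread_low = e[: len(e) // 2].std(ddof=1)
spread_high = e[len(e) // 2 :].std(ddof=1)
```

In practice, a plot of residuals against x (or against the fitted values) and a Normal quantile plot of the residuals remain the primary diagnostics; the numbers above only flag gross violations.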
The regression standard error s has n − 2 degrees of freedom. Inference about β0 and β1 uses t distributions with n − 2 degrees of freedom.
Confidence intervals for the slope of the population regression line have the form b1 ± t* SEb1. In practice, use software to find the slope b1 of the least-squares line and its standard error SEb1.
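A by-hand sketch of the 95% confidence interval b1 ± t* SEb1, using the standard formula SEb1 = s / √Σ(xi − x̄)²; statistical software reports these same quantities directly. The data are made up for illustration.

```python
import numpy as np
from scipy import stats

# Made-up data for illustration
x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.9, 8.2, 8.8])

n = x.size
b1, b0 = np.polyfit(x, y, 1)
e = y - (b0 + b1 * x)
s = np.sqrt(np.sum(e**2) / (n - 2))              # regression standard error
se_b1 = s / np.sqrt(np.sum((x - x.mean())**2))   # standard error of the slope
t_star = stats.t.ppf(0.975, df=n - 2)            # critical value, n-2 df
ci = (b1 - t_star * se_b1, b1 + t_star * se_b1)  # 95% CI for beta1
```

Because the interval here lies entirely above zero, these (invented) data would give significant evidence of a positive slope.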
To test the hypothesis that the population slope is zero, use the t statistic t = b1/SEb1, also given by software. This null hypothesis says that straight-line dependence on x has no value for predicting y.
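The test statistic and its two-sided P-value can be computed by hand as a sketch (same made-up data as before); software such as scipy's `stats.linregress` returns the equivalent slope, standard error, and P-value in one call.

```python
import numpy as np
from scipy import stats

# Made-up data for illustration
x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.9, 8.2, 8.8])

n = x.size
b1, b0 = np.polyfit(x, y, 1)
e = y - (b0 + b1 * x)
s = np.sqrt(np.sum(e**2) / (n - 2))
se_b1 = s / np.sqrt(np.sum((x - x.mean())**2))

t_stat = b1 / se_b1                              # t with n-2 df under H0
p_value = 2 * stats.t.sf(abs(t_stat), df=n - 2)  # two-sided P-value
```

A small P-value is evidence against the null hypothesis that x has no straight-line value for predicting y.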
The t test for zero population slope also tests the null hypothesis that the population correlation is zero. This t statistic can be expressed in terms of the sample correlation, t = r√(n − 2)/√(1 − r²).
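The two forms of the statistic are algebraically identical, which is easy to verify numerically (same made-up data as above):

```python
import numpy as np

# Made-up data for illustration
x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.9, 8.2, 8.8])

n = x.size
r = np.corrcoef(x, y)[0, 1]                      # sample correlation
t_from_r = r * np.sqrt(n - 2) / np.sqrt(1 - r**2)

b1, b0 = np.polyfit(x, y, 1)
e = y - (b0 + b1 * x)
s = np.sqrt(np.sum(e**2) / (n - 2))
se_b1 = s / np.sqrt(np.sum((x - x.mean())**2))
t_from_slope = b1 / se_b1                        # t = b1 / SE_b1
```

The two values agree to floating-point precision, so testing "zero slope" and testing "zero correlation" are the same test.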