Least-squares regression fits a straight line to data in order to predict a response variable y from an explanatory variable x. Inference about regression requires additional conditions.
The simple linear regression model says that there is a population regression line μy = β0 + β1x that describes how the mean response in an entire population varies as x changes. The observed response y for any x has a Normal distribution with mean given by the population regression line and with the same standard deviation σ for any value of x.
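A small simulation can make the model concrete: responses scatter Normally about the population line with the same σ at every x. The parameter values here (β0 = 2, β1 = 0.5, σ = 1) are invented for illustration, not taken from the text.

```python
import numpy as np

# Illustrative parameter values (assumptions, not from the text)
beta0, beta1, sigma = 2.0, 0.5, 1.0

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
mu_y = beta0 + beta1 * x                       # population regression line
y = mu_y + rng.normal(0, sigma, size=x.size)   # same sigma at every x
```

Fitting a least-squares line to such simulated data recovers a slope close to the true β1, which is exactly the sense in which b1 estimates β1.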
The parameters of the simple linear regression model are the intercept β0, the slope β1, and the model standard deviation σ. The intercept b0 and slope b1 of the least-squares line estimate the intercept β0 and slope β1 of the population regression line.
The parameter σ is estimated by the regression standard error

\[ s = \sqrt{\frac{1}{n-2}\sum (y_i - \hat{y}_i)^2} \]

where the differences between the observed and predicted responses are the residuals

\[ e_i = y_i - \hat{y}_i \]
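The residuals and the regression standard error can be computed directly from a least-squares fit. The data below are made up for illustration.

```python
import numpy as np

# Made-up data for illustration
x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.9, 8.2, 8.8])

n = x.size
b1, b0 = np.polyfit(x, y, 1)          # least-squares slope and intercept
y_hat = b0 + b1 * x                   # predicted responses
e = y - y_hat                         # residuals e_i = y_i - y_hat_i
s = np.sqrt(np.sum(e**2) / (n - 2))   # regression standard error, n-2 df
```

Note the divisor n − 2 rather than n − 1: fitting the line uses up two degrees of freedom, one each for the slope and the intercept.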
Prior to inference, always examine the residuals for Normality, constant variance, and any other remaining patterns in the data. Plots of the residuals are commonly used as part of this examination.
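Plots are the standard tools, but simple numeric screens can supplement them. The sketch below uses a Shapiro-Wilk test for Normality and compares residual spread across the two halves of x; both are our own choices of check, not methods prescribed by the text, and the data are made up.

```python
import numpy as np
from scipy import stats

# Made-up data for illustration
x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.9, 8.2, 8.8])

b1, b0 = np.polyfit(x, y, 1)
e = y - (b0 + b1 * x)                 # residuals

# Normality screen: a small p-value casts doubt on the Normal condition
stat, p_normal = stats.shapiro(e)

# Constant-variance screen: residual spread in low-x vs. high-x halves
spread_low = e[: len(e) // 2].std(ddof=1)
spread_high = e[len(e) // 2 :].std(ddof=1)
```

In practice, a plot of residuals against x (or against the fitted values) and a Normal quantile plot of the residuals remain the primary diagnostics; the numbers above only flag gross violations.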
The regression standard error s has n − 2 degrees of freedom. Inference about β0 and β1 uses t distributions with n − 2 degrees of freedom.
Confidence intervals for the slope of the population regression line have the form b1 ± t* SEb1. In practice, use software to find the slope b1 of the least-squares line and its standard error SEb1.
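A by-hand sketch of the 95% confidence interval b1 ± t* SEb1, using the standard formula SEb1 = s / √Σ(xi − x̄)²; statistical software reports these same quantities directly. The data are made up for illustration.

```python
import numpy as np
from scipy import stats

# Made-up data for illustration
x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.9, 8.2, 8.8])

n = x.size
b1, b0 = np.polyfit(x, y, 1)
e = y - (b0 + b1 * x)
s = np.sqrt(np.sum(e**2) / (n - 2))              # regression standard error
se_b1 = s / np.sqrt(np.sum((x - x.mean())**2))   # standard error of the slope
t_star = stats.t.ppf(0.975, df=n - 2)            # critical value, n-2 df
ci = (b1 - t_star * se_b1, b1 + t_star * se_b1)  # 95% CI for beta1
```

Because the interval here lies entirely above zero, these (invented) data would give significant evidence of a positive slope.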
To test the hypothesis that the population slope is zero, use the t statistic t = b1/SEb1, also given by software. This null hypothesis says that straight-line dependence on x has no value for predicting y.
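The test statistic and its two-sided P-value can be computed by hand as a sketch (same made-up data as before); software such as scipy's `stats.linregress` returns the equivalent slope, standard error, and P-value in one call.

```python
import numpy as np
from scipy import stats

# Made-up data for illustration
x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.9, 8.2, 8.8])

n = x.size
b1, b0 = np.polyfit(x, y, 1)
e = y - (b0 + b1 * x)
s = np.sqrt(np.sum(e**2) / (n - 2))
se_b1 = s / np.sqrt(np.sum((x - x.mean())**2))

t_stat = b1 / se_b1                              # t with n-2 df under H0
p_value = 2 * stats.t.sf(abs(t_stat), df=n - 2)  # two-sided P-value
```

A small P-value is evidence against the null hypothesis that x has no straight-line value for predicting y.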
The t test for zero population slope also tests the null hypothesis that the population correlation is zero. This t statistic can be expressed in terms of the sample correlation, t = r√(n − 2)/√(1 − r²).
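The two forms of the statistic are algebraically identical, which is easy to verify numerically (same made-up data as above):

```python
import numpy as np

# Made-up data for illustration
x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.9, 8.2, 8.8])

n = x.size
r = np.corrcoef(x, y)[0, 1]                      # sample correlation
t_from_r = r * np.sqrt(n - 2) / np.sqrt(1 - r**2)

b1, b0 = np.polyfit(x, y, 1)
e = y - (b0 + b1 * x)
s = np.sqrt(np.sum(e**2) / (n - 2))
se_b1 = s / np.sqrt(np.sum((x - x.mean())**2))
t_from_slope = b1 / se_b1                        # t = b1 / SE_b1
```

The two values agree to floating-point precision, so testing "zero slope" and testing "zero correlation" are the same test.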