11.25 Checking for a polynomial relationship. When looking at the residuals from the simple linear model of BMI versus physical activity (PA), Figure 10.5 (page 566) suggested a possible curvilinear relationship. Let’s investigate this further. Multiple regression can be used to fit a polynomial curve of degree q, $y = \beta_0 + \beta_1 x + \beta_2 x^2 + \dots + \beta_q x^q$, through the creation of additional explanatory variables $x^2$, $x^3$, etc. Here we consider a quadratic fit (q = 2) for the physical activity problem.
(a) It is often best to subtract the sample mean before creating the necessary explanatory variables. In this case, the average number of steps per day is 8.614. Create new explanatory variables $x_1 = \mathrm{PA} - 8.614$ and $x_2 = (\mathrm{PA} - 8.614)^2$ and run a multiple regression for BMI using the explanatory variables $x_1$ and $x_2$. Write down the fitted regression equation.
(b) The regression model that included only PA had an $R^2$ of 14.9%. What is $R^2$ with the inclusion of this quadratic term?
(c) Obtain the residuals from part (a) and check the multiple regression assumptions. Are there any remaining patterns in the data? Are the residuals approximately Normal? Explain.
(d) Test the hypothesis that the coefficient of the variable $(\mathrm{PA} - 8.614)^2$ is equal to 0. Report the t statistic, degrees of freedom, and P-value. Does the quadratic term contribute significantly to the fit? Explain your answer.
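The mechanics of parts (a) through (d) can be sketched in Python with NumPy. This is only an illustration under stated assumptions: the helper name `quadratic_fit` is hypothetical, and the `pa` and `bmi` arrays below are made-up numbers generated for the example, not the data set from the exercise, so the printed values say nothing about the actual answers. The sketch centers the explanatory variable at its sample mean, builds the squared term, fits the model by least squares, and reports $R^2$, the residuals, and the t statistic for each coefficient with n − 3 degrees of freedom.

```python
import numpy as np


def quadratic_fit(x, y):
    """Least-squares fit of y = b0 + b1*(x - xbar) + b2*(x - xbar)^2.

    Hypothetical helper for illustration; returns a dict of summaries.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)

    # Part (a): center x at its sample mean, then create the squared term.
    xc = x - x.mean()
    X = np.column_stack([np.ones(n), xc, xc ** 2])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)

    # Part (c): residuals, to be plotted against x and checked for Normality.
    resid = y - X @ coef

    # Part (b): R^2 = 1 - SSE/SST.
    sse = float(resid @ resid)
    sst = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - sse / sst

    # Part (d): standard errors and t statistics with n - 3 degrees of freedom.
    df = n - X.shape[1]
    s2 = sse / df
    se = np.sqrt(np.diag(s2 * np.linalg.inv(X.T @ X)))
    t = coef / se

    return {"coef": coef, "se": se, "t": t, "df": df, "r2": r2, "resid": resid}


# Made-up data for illustration only (NOT the exercise's data set).
rng = np.random.default_rng(0)
pa = rng.uniform(3.0, 14.0, size=50)
bmi = 24.0 - 0.6 * (pa - 8.5) + 0.05 * (pa - 8.5) ** 2 + rng.normal(0.0, 3.0, size=50)

fit = quadratic_fit(pa, bmi)
print("coefficients (b0, b1, b2):", fit["coef"])
print("R^2:", fit["r2"])
print("t statistic for the quadratic term:", fit["t"][2], "with df =", fit["df"])
```

The P-value for part (d) is the two-sided tail area of the t distribution with `fit["df"]` degrees of freedom beyond the t statistic for the quadratic term; it can be read from a t table or from statistical software. With the real data, `pa` and `bmi` would be replaced by the observed columns, and the centering constant would be the sample mean 8.614 given in part (a).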