Exercises 11.98 through 11.104 use the CROPS data file, which contains the U.S. yield (bushels/acre) of corn and soybeans from 1957 to 2013.²³

Question 11.101

11.101 Try a quadratic.

We need a new variable to model the curved relation that we see between corn yield and year in the residual plot of the last exercise. Let year2 = (year − 1985)². (When adding a squared term to a multiple regression model, we sometimes subtract the mean of the variable being squared before squaring. This eliminates the correlation between the linear and quadratic terms in the model and thereby reduces collinearity.)
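The effect of centering can be checked directly on the year values, without any yield data. The sketch below assumes only that the years run from 1957 through 2013 with no gaps; the helper `pearson` and the variable names are illustrative and not part of the CROPS file.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


years = list(range(1957, 2014))        # assumed: every year 1957..2013
mean_year = sum(years) / len(years)    # mean of the years

raw_sq = [y ** 2 for y in years]                     # square without centering
centered_sq = [(y - mean_year) ** 2 for y in years]  # square after centering

r_raw = pearson(years, raw_sq)
r_centered = pearson(years, centered_sq)
print(f"corr(year, year^2)          = {r_raw:.6f}")
print(f"corr(year, (year - mean)^2) = {r_centered:.6f}")
```

Without centering, year and its square are almost perfectly correlated. After centering, the correlation is zero up to rounding error, because the years are spaced symmetrically about their mean.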


  1. Run the multiple linear regression using year, year2, and soybean yield to predict corn yield. Give the fitted regression equation.
  2. Give the null and alternative hypotheses for the ANOVA test. Report the results of this test, giving the test statistic, degrees of freedom, P-value, and conclusion.
  3. What percent of the variation in corn yield is explained by this multiple regression? Compare this with the model in the previous exercise.
  4. Summarize the results of the significance tests for the individual regression coefficients.
  5. Analyze the residuals and summarize your conclusions.
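The quantities requested in parts 1 through 3 and part 5 can be computed along the following lines. This is a minimal sketch, not the software output the exercise has in mind. It assumes the three CROPS columns have already been read into plain Python lists; the function and argument names are hypothetical. The linear term is centered internally for numerical stability, which changes only the intercept, and the intercept is converted back to the original year scale before it is returned.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_quadratic_model(year, soy, corn):
    """Least-squares fit of corn on year, year2 = (year - mean)^2, and soy."""
    n = len(corn)
    mean_year = sum(year) / n
    centered = [y - mean_year for y in year]
    # Design matrix rows: intercept, centered year, year2, soybean yield.
    rows = [[1.0, c, c * c, s] for c, s in zip(centered, soy)]
    p = 3  # number of explanatory variables

    # Normal equations (X'X) b = X'y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p + 1)]
           for i in range(p + 1)]
    xty = [sum(r[i] * y for r, y in zip(rows, corn)) for i in range(p + 1)]
    b = solve_linear_system(xtx, xty)

    fitted = [sum(bi * xi for bi, xi in zip(b, r)) for r in rows]
    residuals = [y - f for y, f in zip(corn, fitted)]

    mean_corn = sum(corn) / n
    sst = sum((y - mean_corn) ** 2 for y in corn)  # total sum of squares
    sse = sum(e * e for e in residuals)            # error sum of squares
    ssm = sst - sse                                # model sum of squares
    df_model, df_error = p, n - p - 1

    return {
        # Centering the linear term changes only the intercept; convert back.
        "intercept": b[0] - b[1] * mean_year,
        "year": b[1],
        "year2": b[2],
        "soy": b[3],
        "r_squared": ssm / sst,
        "f_stat": (ssm / df_model) / (sse / df_error),
        "df": (df_model, df_error),
        "residuals": residuals,
    }
```

Calling `fit_quadratic_model(year, soy, corn)` returns the fitted coefficients, R², the ANOVA F statistic with its degrees of freedom, and the residuals for plotting against each predictor. The P-value for the F statistic is not computed here; it can be read from an F table or obtained from statistical software.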

11.101

(a) […]

(b) H0: β1 = β2 = β3 = 0; Ha: at least one βj ≠ 0. F = […], df are 3 and […], and P […]. There is a significant multiple linear regression between corn yield and the predictors Year, Year2, and SoyBeanYield. Together, the predictors can significantly predict corn yield.

(c) R² = […], up from 93.82%.

(d) For Year: […]. Year is significant in predicting corn yield in a model already containing Year2 and SoyBeanYield. For Year2: […]. Year2 is significant in predicting corn yield in a model already containing Year and SoyBeanYield. For SoyBeanYield: […]. SoyBeanYield is significant in predicting corn yield in a model already containing Year and Year2.

(e) The Normal quantile plot shows a roughly Normal distribution; there is one observation with a fairly large residual. The residual plots all look good (random); the residual plot for Year is much better and doesn’t have the rising and falling that the previous plot had. Overall, the model fit is much better using the quadratic term for Year than without.