Multiple regression describes the linear relationship between one response variable $y$ and more than one predictor variable, $x_1, x_2, x_3, \ldots$. The multiple regression equation is an extension of the regression equation: $\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_k x_k$, where $k$ represents the number of $x$ variables in the equation, and $b_1, b_2, b_3, \ldots, b_k$ represent the multiple regression coefficients.
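As a rough illustration of how the coefficients of a multiple regression equation can be estimated, the sketch below fits an equation with k = 2 predictors by least squares using NumPy. The data are made up for this example and constructed so that the response is exactly 1 + 2·x1 + 3·x2; the variable names are assumptions, not part of the text.

```python
import numpy as np

# Illustrative data (made up for this sketch): n = 6 observations, k = 2 predictors.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])

# The response is built exactly as y = 1 + 2*x1 + 3*x2,
# so the fitted coefficients should come out close to 1, 2, and 3.
y = 1.0 + 2.0 * x1 + 3.0 * x2

# Design matrix: a column of ones for b0, then one column per predictor.
X = np.column_stack([np.ones_like(x1), x1, x2])

# Least-squares estimates b = (b0, b1, b2).
b, *_ = np.linalg.lstsq(X, y, rcond=None)

# Predicted values: y_hat = b0 + b1*x1 + b2*x2.
y_hat = X @ b

print("coefficients:", np.round(b, 4))
```

With real data the response will not fall exactly on the fitted equation, and the differences between y and y_hat are the residuals.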
The multiple coefficient of determination $R^2$ represents the proportion of the variability in the response $y$ that is explained by the multiple regression equation. The adjusted coefficient of determination $R^2_{\text{adj}}$ adjusts the value of $R^2$ as a penalty for having too many unhelpful $x$ variables in the equation.
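The two measures can be sketched in a few lines of Python, assuming the usual definitions: the coefficient of determination is 1 − SSE/SST, and the adjusted version is 1 − (1 − R²)(n − 1)/(n − k − 1), where n is the sample size and k the number of predictors. The function names and the small data set below are hypothetical, chosen only so the arithmetic is easy to follow.

```python
def r_squared(y, y_hat):
    """Proportion of variability in y explained by the predictions: 1 - SSE/SST."""
    y_bar = sum(y) / len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # unexplained variability
    sst = sum((yi - y_bar) ** 2 for yi in y)               # total variability
    return 1.0 - sse / sst


def adjusted_r_squared(r2, n, k):
    """Penalize R^2 for the number of predictors k, given sample size n."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# Hypothetical observed responses and predictions from a model with k = 2 predictors.
y = [1.0, 2.0, 3.0, 4.0, 5.0]
y_hat = [1.1, 1.9, 3.2, 3.8, 5.0]

r2 = r_squared(y, y_hat)                              # SSE = 0.10, SST = 10, so about 0.99
r2_adj = adjusted_r_squared(r2, n=len(y), k=2)        # about 0.98
print(round(r2, 4), round(r2_adj, 4))
```

Because the penalty factor (n − 1)/(n − k − 1) grows with k, adding a predictor that barely reduces SSE lowers the adjusted value even though the unadjusted value cannot decrease.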
The multiple regression model is an extension of the regression model from Section 13.1. The population multiple regression equation is $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon$. The F test is performed to assess the significance of the overall model.
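For reference, the overall F test can be written out as follows, assuming the standard formulation with n observations and k predictors. SSR and SSE denote the regression and error sums of squares; these symbols are introduced here for illustration and do not appear in the summary above.

```latex
H_0 : \beta_1 = \beta_2 = \cdots = \beta_k = 0
\qquad \text{versus} \qquad
H_a : \text{at least one } \beta_i \neq 0

F = \frac{\mathrm{MSR}}{\mathrm{MSE}}
  = \frac{\mathrm{SSR}/k}{\mathrm{SSE}/(n - k - 1)},
\qquad
\mathrm{df}_1 = k, \quad \mathrm{df}_2 = n - k - 1
```

A large value of F (equivalently, a small p-value) is evidence against the null hypothesis that none of the predictors has a linear relationship with the response.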
To determine whether a particular x variable has a significant linear relationship with the response variable y, we perform the t test for the significance of that x variable. Provided the overall F test is significant, one may perform as many such t tests as there are x variables in the model, that is, k of them.
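The test statistic for each individual t test can be sketched with NumPy as below, assuming the usual form: the estimated coefficient divided by its standard error, with n − k − 1 degrees of freedom. The data are made up for illustration (the response depends on x1 only, plus a small fixed disturbance), and all variable names are assumptions.

```python
import numpy as np

# Illustrative data (made up): the response depends on x1 but not on x2.
x1 = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
x2 = np.array([2, 1, 4, 3, 6, 5, 8, 7], dtype=float)
noise = np.array([0.1, 0.1, -0.2, 0.0, 0.2, -0.1, -0.1, 0.0])
y = 1.0 + 2.0 * x1 + noise

# Design matrix with an intercept column; k is the number of predictors.
X = np.column_stack([np.ones_like(x1), x1, x2])
n, k = X.shape[0], X.shape[1] - 1

# Least-squares coefficients and residuals.
b, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ b

# Mean squared error with n - k - 1 degrees of freedom.
mse = residuals @ residuals / (n - k - 1)

# Standard errors: square roots of the diagonal of MSE * (X'X)^(-1).
se_b = np.sqrt(np.diag(mse * np.linalg.inv(X.T @ X)))

# One t statistic per coefficient: t = b_i / se(b_i).
t_stats = b / se_b

for name, coef, se, t in zip(["b0", "b1", "b2"], b, se_b, t_stats):
    print(f"{name}: estimate={coef:.4f}  se={se:.4f}  t={t:.2f}")
```

Each t statistic would then be compared with a critical value from the t distribution with n − k − 1 degrees of freedom, or converted to a p-value, to decide whether that x variable is significant.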
Dummy variables are 0/1 variables that allow, via recoding, categorical variables to be included in the multiple regression model.
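The recoding step can be sketched in plain Python as follows. The helper name `make_dummies` and the `region` data are hypothetical. The sketch follows the common convention of creating one 0/1 variable for each category except a chosen reference category, so a categorical variable with c categories yields c − 1 dummy variables.

```python
def make_dummies(values, reference):
    """Recode a categorical variable into 0/1 dummy variables.

    One dummy is created for every category except `reference`;
    observations in the reference category get 0 on every dummy.
    """
    levels = sorted(set(values) - {reference})
    return {
        level: [1 if v == level else 0 for v in values]
        for level in levels
    }


# Hypothetical categorical predictor with three categories.
region = ["north", "south", "west", "north", "west"]

dummies = make_dummies(region, reference="north")
for name, column in dummies.items():
    print(name, column)
# south [0, 1, 0, 0, 0]
# west [0, 0, 1, 0, 1]
```

Each resulting column can then be entered into the multiple regression equation like any other x variable, and its coefficient is interpreted relative to the reference category.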
The Strategy for Building a Multiple Regression Model brings together all we have learned about multiple regression modeling.