Section 13.3 Exercises

CLARIFYING THE CONCEPTS

Question 13.103

1. Write the multiple regression equation for $k$ predictor variables. (p. 743)

13.3.1

$\hat{y} = b_0 + b_1x_1 + b_2x_2 + \cdots + b_kx_k$

Question 13.104

2. Which is preferable, $R^2$ or $R^2_{\text{adj}}$, and why? (p. 745)

Question 13.105

3. Which test do we perform if we want to determine whether our multiple regression is useful? (p. 746)

13.3.3

The $F$ test for the overall significance of the multiple regression

Question 13.106

4. If we conclude from the $F$ test that our multiple regression is useful, is it still possible that one of the $\beta_i$s equals zero? Explain. (p. 746)

Question 13.107

5. Explain the difference between the $t$ test and the $F$ test we learned in this section. (p. 747)

13.3.5

The $F$ test is for the overall significance of the multiple regression, and the $t$ test is for testing whether a particular $x$-variable has a significant relationship with the response variable $y$.
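
For reference, the two test statistics can be written as follows. This is a sketch in standard notation, assuming $n$ observations and $k$ predictor variables; the symbols SSR, SSE, MSR, MSE, and $s_{b_i}$ (the standard error of $b_i$) are the usual textbook names and may differ slightly from the notation used on pp. 746–747.

```latex
% Overall significance: H_0: \beta_1 = \beta_2 = \cdots = \beta_k = 0
F = \frac{\mathrm{MSR}}{\mathrm{MSE}}
  = \frac{\mathrm{SSR}/k}{\mathrm{SSE}/(n-k-1)}

% Individual predictor x_i: H_0: \beta_i = 0, with n-k-1 degrees of freedom
t = \frac{b_i}{s_{b_i}}
```

One $F$ test covers the whole model, whereas a separate $t$ test is available for each predictor.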

Question 13.108

6. How many $t$ tests may we perform for a multiple regression model? (p. 747)

Question 13.109

7. How do we interpret the coefficient for a dummy variable? (Hint: Consider Figure 29.) (p. 750)

13.3.7

The coefficient of a dummy variable can be interpreted as the estimated increase in $y$ for those observations with the value of the dummy variable equal to 1, as compared to those with the value of the dummy variable equal to 0, when all of the other variables are held constant.
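
A short derivation makes this interpretation concrete. It is a sketch for a hypothetical model with one ordinary predictor $x_1$ and one dummy variable $d$; the names $d$ and $b_d$ are chosen here for illustration and are not taken from the text.

```latex
\hat{y} = b_0 + b_1 x_1 + b_d\, d, \qquad d \in \{0, 1\}

\hat{y}\big|_{d=1} - \hat{y}\big|_{d=0}
  = (b_0 + b_1 x_1 + b_d) - (b_0 + b_1 x_1)
  = b_d
```

With $x_1$ held fixed, the two groups follow parallel lines whose intercepts differ by $b_d$.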

Question 13.110

8. What are the four steps of the Strategy for Building a Multiple Regression Model? (p. 750)

PRACTICING THE TECHNIQUES

CHECK IT OUT!

| To do | Check out | Topic |
|---|---|---|
| Exercises 9–16 | Example 11 | Multiple regression equation, coefficients, and predictions |
| Exercises 17–20 | Example 12 | Calculating and interpreting the adjusted coefficient of determination |
| Exercises 21–22 and 27–28 | Example 13 | $F$ test for the overall significance of the multiple regression |
| Exercises 23 and 30 | Example 14 | Performing a set of $t$ tests for the significance of a set of individual variables |
| Exercise 29 | Example 15 | Dummy variables in multiple regression |
| Exercises 24–26 and 31–33 | Example 16 | Strategy for building a multiple regression model |

Use the following information for Exercises 9–12: A multiple regression model has been produced for a set of observations with multiple regression equation , with multiple coefficient of determination .

Assume the regression assumptions are met.

Question 13.111

9. Interpret the value of the coefficient for $x_1$.

13.3.9

For each increase in one unit of the variable $x_1$, the estimated value of $y$ increases by 5 units when the value of $x_2$ is held constant.

Question 13.112

10. Explain what the value of means.

Question 13.113

11. Interpret the coefficients $b_0$, $b_1$, and $b_2$.

13.3.11

The coefficient $b_0$ is the estimated value of $y$ when $x_1 = 0$ and $x_2 = 0$. $b_1 = 5$ means that for each increase of one unit of the variable $x_1$, the estimated value of $y$ increases by 5 units when the value of $x_2$ is held constant. $b_2 = 8$ means that for each increase of one unit of the variable $x_2$, the estimated value of $y$ increases by 8 units when the value of $x_1$ is held constant.

Question 13.114

12. Find point estimates of $y$ for the following values of $x_1$ and $x_2$:

Use the following information for Exercises 13–16: A multiple regression model has been produced for a set of observations with multiple regression equation , with multiple coefficient of determination . Assume the regression assumptions are met.

Question 13.115

13. Interpret the value of the coefficient for $x_1$.

13.3.13

For each increase in one unit of the variable $x_1$, the estimated value of $y$ decreases by 0.1 unit when the value of $x_2$ is held constant.

Question 13.116

14. Explain what the value of means.

Question 13.117

15. Interpret the coefficients $b_0$, $b_1$, and $b_2$.

13.3.15

The coefficient $b_0$ is the estimated value of $y$ when $x_1 = 0$ and $x_2 = 0$. $b_1 = -0.1$ means that for each increase of one unit of the variable $x_1$, the estimated value of $y$ decreases by 0.1 unit when the value of $x_2$ is held constant. $b_2 = 0.9$ means that for each increase of one unit of the variable $x_2$, the estimated value of $y$ increases by 0.9 unit when the value of $x_1$ is held constant.

Question 13.118

16. Find point estimates of $y$ for the following values of $x_1$ and $x_2$:

Question 13.119

17. For the data in Exercises 9–12, how should the value of $R^2$ be interpreted?

13.3.17

50% of the variability in $y$ is accounted for by this multiple regression equation.

Question 13.120

18. Calculate $R^2_{\text{adj}}$ for the data in Exercises 9–12.

Question 13.121

19. For the data in Exercises 13–16, how should the value of $R^2$ be interpreted?

13.3.19

75% of the variability in $y$ is accounted for by this multiple regression equation.

Question 13.122

20. Calculate $R^2_{\text{adj}}$ for the data in Exercises 13–16.
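
Exercises 18 and 20 rest on the standard adjustment $R^2_{\text{adj}} = 1 - (1 - R^2)\,\frac{n-1}{n-k-1}$. The sketch below wraps that formula in a small helper; the function name `adjusted_r_squared` is made up for illustration, and the sample size `n = 10` is purely an assumption, since the number of observations for these exercises is not shown here. Both models have two predictors, so `k = 2`.

```python
def adjusted_r_squared(r2: float, n: int, k: int) -> float:
    """Adjusted coefficient of determination for n observations and k predictors."""
    if n - k - 1 <= 0:
        raise ValueError("need more observations than coefficients (n > k + 1)")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# n = 10 is a hypothetical sample size, NOT a value taken from the exercises.
print(adjusted_r_squared(0.50, n=10, k=2))  # R^2 = 0.50 as in Exercises 9-12
print(adjusted_r_squared(0.75, n=10, k=2))  # R^2 = 0.75 as in Exercises 13-16
```

Substitute the actual number of observations for `n` to obtain the values the exercises ask for.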

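Use the following data set for Exercises 21–26.

| $y$ | $x_1$ | $x_2$ | $x_3$ |
|---|---|---|---|
| 0.6 | 1 | 10 | 1.3 |
| 4.0 | 2 | 10 | −3.2 |
| 3.2 | 3 | 8 | −1.0 |
| 9.0 | 4 | 8 | 0.9 |
| 1.8 | 5 | 6 | −2.5 |
| 8.4 | 6 | 6 | 0.9 |
| 9.8 | 7 | 4 | 1.0 |
| 10.4 | 8 | 4 | 2.0 |
| 8.8 | 9 | 2 | 0.2 |
| 14.7 | 10 | 2 | −2.2 |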

Question 13.123

21. Perform the multiple regression of $y$ on $x_1$, $x_2$, and $x_3$, and write the multiple regression equation.

13.3.21
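
The fit requested in Exercise 21 can be sketched with NumPy's least-squares solver. This is one possible approach, not necessarily the software the text has in mind, and it assumes the four columns of the data set are $y$, $x_1$, $x_2$, $x_3$ in that order.

```python
import numpy as np

# Data set for Exercises 21-26. ASSUMPTION: columns are y, x1, x2, x3.
data = np.array([
    [0.6, 1, 10, 1.3],
    [4.0, 2, 10, -3.2],
    [3.2, 3, 8, -1.0],
    [9.0, 4, 8, 0.9],
    [1.8, 5, 6, -2.5],
    [8.4, 6, 6, 0.9],
    [9.8, 7, 4, 1.0],
    [10.4, 8, 4, 2.0],
    [8.8, 9, 2, 0.2],
    [14.7, 10, 2, -2.2],
])
y = data[:, 0]
# Design matrix: a column of ones for the intercept, then x1, x2, x3.
X = np.column_stack([np.ones(len(y)), data[:, 1:]])

# Least-squares estimates b0, b1, b2, b3.
b, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ b

print("b0, b1, b2, b3 =", np.round(b, 4))
```

The printed coefficients give the equation $\hat{y} = b_0 + b_1x_1 + b_2x_2 + b_3x_3$ for this data set.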

Question 13.124

22. Assume the regression assumptions are met. Perform the $F$ test for the significance of the overall regression, using level of significance . Do the following:

  (a) State the hypotheses and the rejection rule.
  (b) Find the $F$ statistic and the $p$-value.
  (c) State the conclusion and interpretation.

Question 13.125

23. Perform the $t$ test for the significance of the individual predictor variables, using level of significance . Do the following:

  (a) For each hypothesis test, state the hypotheses and the rejection rule.
  (b) For each hypothesis test, find the $t$ statistic and the $p$-value.
  (c) For each hypothesis test, state the conclusion and interpretation.

13.3.23

(a) Test 1: $H_0: \beta_1 = 0$: There is no linear relationship between $x_1$ and $y$. $H_a: \beta_1 \neq 0$: There is a linear relationship between $x_1$ and $y$. Reject $H_0$ if the $p$-value $\le \alpha$.
Test 2: $H_0: \beta_2 = 0$: There is no linear relationship between $x_2$ and $y$. $H_a: \beta_2 \neq 0$: There is a linear relationship between $x_2$ and $y$. Reject $H_0$ if the $p$-value $\le \alpha$.
Test 3: $H_0: \beta_3 = 0$: There is no linear relationship between $x_3$ and $y$. $H_a: \beta_3 \neq 0$: There is a linear relationship between $x_3$ and $y$. Reject $H_0$ if the $p$-value $\le \alpha$.

(b) Test , with . Test , with . Test , with .

(c) Test 1: The $p$-value is $\le \alpha$. Therefore we reject $H_0$. There is evidence of a linear relationship between $x_1$ and $y$.
Test 2: The $p$-value is $\le \alpha$. Therefore we reject $H_0$. There is evidence of a linear relationship between $x_2$ and $y$.
Test 3: The $p$-value is not $\le \alpha$. Therefore we do not reject $H_0$. There is insufficient evidence of a linear relationship between $x_3$ and $y$.

Question 13.126

24. Identify any predictors that have corresponding $p$-values greater than the level of significance . Of these, discard the variable with the largest $p$-value. Then redo Exercise 23, omitting this predictor. Repeat if necessary.

Question 13.127

25. Verify the regression assumptions for your final model from Exercise 24.

13.3.25

[Figure: scatterplot of the residuals versus fitted values]

[Figure: normal probability plot of the residuals]

The scatterplot above of the residuals versus fitted values shows no strong evidence of unhealthy patterns. Thus, the independence assumption, the constant variance assumption, and the zero-mean assumption are verified. Also, the normal probability plot of the residuals above indicates no evidence of departure from normality of the residuals. Therefore we conclude that the regression assumptions are verified.

Question 13.128

26. Report and interpret your final model from Exercise 24, by doing the following:

  (a) Provide the multiple regression equation for your final model.
  (b) Interpret the multiple regression coefficients so that a nonstatistician could understand.
  (c) Report and interpret the standard error of the estimate $s$ and the adjusted coefficient of determination $R^2_{\text{adj}}$.

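Use the following data set for Exercises 27–33. Note that $x_3$ is a dummy variable.

| $y$ | $x_1$ | $x_2$ | $x_3$ |
|---|---|---|---|
| −0.7 | 2 | 0.1 | 0 |
| 6.4 | 4 | −2.5 | 1 |
| 2.8 | 6 | 2.7 | 0 |
| 9.4 | 8 | 2.8 | 1 |
| 8.6 | 10 | −1.6 | 0 |
| 13.1 | 12 | 1.0 | 1 |
| 12.2 | 14 | −1.4 | 0 |
| 19.1 | 16 | −0.5 | 1 |
| 18.8 | 18 | 1.0 | 0 |
| 23.2 | 20 | −2.3 | 1 |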

Question 13.129

27. Perform the multiple regression of $y$ on $x_1$, $x_2$, and $x_3$, and write the multiple regression equation.

13.3.27

The regression equation is $\hat{y} = -2.98 + 1.13x_1 - 0.175x_2 + 3.55x_3$.

Question 13.130

28. Assume the regression assumptions are met. Perform the $F$ test for the significance of the overall regression, using level of significance . Do the following:

  (a) State the hypotheses and the rejection rule.
  (b) Find the $F$ statistic and the $p$-value.
  (c) State the conclusion and interpretation.
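
Exercises 27 and 28 can be checked numerically. The sketch below fits the full model by least squares and computes the $F$ statistic as MSR/MSE; it assumes the data columns are $y$, $x_1$, $x_2$, $x_3$ (with $x_3$ the dummy variable), and it stops short of the $p$-value, which could be obtained from an $F$ distribution with $k$ and $n-k-1$ degrees of freedom (for example with `scipy.stats.f.sf`, if SciPy is available).

```python
import numpy as np

# Data set for Exercises 27-33. ASSUMPTION: columns are y, x1, x2, x3 (x3 = dummy).
data = np.array([
    [-0.7, 2, 0.1, 0],
    [6.4, 4, -2.5, 1],
    [2.8, 6, 2.7, 0],
    [9.4, 8, 2.8, 1],
    [8.6, 10, -1.6, 0],
    [13.1, 12, 1.0, 1],
    [12.2, 14, -1.4, 0],
    [19.1, 16, -0.5, 1],
    [18.8, 18, 1.0, 0],
    [23.2, 20, -2.3, 1],
])
y = data[:, 0]
n, k = len(y), 3                                   # 10 observations, 3 predictors
X = np.column_stack([np.ones(n), data[:, 1:]])     # intercept + x1, x2, x3

b, *_ = np.linalg.lstsq(X, y, rcond=None)          # b0, b1, b2, b3
fitted = X @ b

sse = np.sum((y - fitted) ** 2)                    # error sum of squares
ssr = np.sum((fitted - y.mean()) ** 2)             # regression sum of squares
f_stat = (ssr / k) / (sse / (n - k - 1))           # F = MSR / MSE

print("coefficients:", np.round(b, 4))
print("F statistic:", round(f_stat, 2))
```

The fitted coefficient on the dummy variable rounds to the 3.55 quoted in answer 13.3.29, which supports the assumed column order.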

Question 13.131

29. Interpret the coefficient for the dummy variable.

13.3.29

The estimated increase in $y$ for those observations with $x_3 = 1$, as compared to those with $x_3 = 0$, is 3.55 units when the values of $x_1$ and $x_2$ are held constant.

Question 13.132

30. Perform the $t$ test for the significance of the individual predictor variables, using level of significance . Do the following:

  (a) For each hypothesis test, state the hypotheses and the rejection rule.
  (b) For each hypothesis test, find the $t$ statistic and the $p$-value.
  (c) For each hypothesis test, state the conclusion and interpretation.

Question 13.133

31. Identify any predictors that have corresponding $p$-values greater than the level of significance . Of these, discard the variable with the largest $p$-value. Then redo Exercise 30, omitting this predictor. Repeat if necessary.

13.3.31

The $p$-value for $x_2$ is the only $p$-value greater than $\alpha$, so we eliminate $x_2$ from the multiple regression equation. The new regression equation is $\hat{y} = -3.12 + 1.15x_1 + 3.61x_3$.

(a) Test 1: $H_0: \beta_1 = 0$: There is no linear relationship between $x_1$ and $y$. $H_a: \beta_1 \neq 0$: There is a linear relationship between $x_1$ and $y$. Reject $H_0$ if the $p$-value $\le \alpha$.
Test 2: $H_0: \beta_3 = 0$: There is no linear relationship between $x_3$ and $y$. $H_a: \beta_3 \neq 0$: There is a linear relationship between $x_3$ and $y$. Reject $H_0$ if the $p$-value $\le \alpha$.

(b) Test 1: , with ; Test 2: , with .

(c) Test 1: The $p$-value is $\le \alpha$. Therefore we reject $H_0$. There is evidence of a linear relationship between $x_1$ and $y$.
Test 2: The $p$-value is $\le \alpha$. Therefore we reject $H_0$. There is evidence of a linear relationship between $x_3$ and $y$.

Since all of the remaining variables are significant, we have our final multiple regression equation.

Question 13.134

32. Verify the regression assumptions for your final model from Exercise 31.

Question 13.135

33. Report and interpret your final model from Exercise 31, by doing the following:

  (a) Provide the multiple regression equation for your final model.
  (b) Interpret the multiple regression coefficients so that a nonstatistician could understand.
  (c) Report and interpret the standard error of the estimate $s$ and the adjusted coefficient of determination $R^2_{\text{adj}}$.

13.3.33

(a) The final multiple regression equation is $\hat{y} = -3.12 + 1.15x_1 + 3.61x_3$. For $x_3 = 0$, the regression equation is $\hat{y} = -3.12 + 1.15x_1$. For $x_3 = 1$, the regression equation is $\hat{y} = 0.49 + 1.15x_1$.

(b) For each increase in one unit of the variable $x_1$, the estimated value of $y$ increases by 1.15 units when $x_3$ is held constant. The estimated increase in $y$ for those observations with $x_3 = 1$, as compared to those with $x_3 = 0$, when $x_1$ is held constant, is 3.61.

(c) Using the multiple regression equation in (a), the size of the typical prediction error will be about $s = 0.959129$. $R^2_{\text{adj}} = 98.4\%$; that is, 98.4% of the variability in $y$ is accounted for by this multiple regression equation.
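
The figures in this answer can be reproduced with a few lines of NumPy. The sketch assumes the data set for Exercises 27–33 has columns $y$, $x_1$, $x_2$, $x_3$ and that the final model keeps $x_1$ and the dummy variable $x_3$; the variable names are illustrative.

```python
import numpy as np

# ASSUMPTION: first, second, and fourth columns of the data set are y, x1, x3.
y = np.array([-0.7, 6.4, 2.8, 9.4, 8.6, 13.1, 12.2, 19.1, 18.8, 23.2])
x1 = np.array([2, 4, 6, 8, 10, 12, 14, 16, 18, 20], dtype=float)
x3 = np.array([0, 1, 0, 1, 0, 1, 0, 1, 0, 1], dtype=float)

n, k = len(y), 2
X = np.column_stack([np.ones(n), x1, x3])          # intercept, x1, dummy x3
b, *_ = np.linalg.lstsq(X, y, rcond=None)          # b0, b1, b3

sse = np.sum((y - X @ b) ** 2)                     # error sum of squares
sst = np.sum((y - y.mean()) ** 2)                  # total sum of squares

s = np.sqrt(sse / (n - k - 1))                     # standard error of the estimate
r2_adj = 1 - (sse / (n - k - 1)) / (sst / (n - 1)) # adjusted R^2

print("b0, b1, b3 =", np.round(b, 4))
print("s =", round(s, 6), " adjusted R^2 =", round(r2_adj, 4))
```

The slope on $x_1$, the dummy coefficient, $s$, and the adjusted $R^2$ agree with the 1.15, 3.61, 0.959129, and 98.4% reported above.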

APPLYING THE CONCEPTS

For Exercises 34–39, apply the Strategy for Building a Multiple Regression Model by performing the following steps, using level of significance :

  1. Step 1: Perform the $F$ test for significance of the overall regression.
  2. Step 2: Perform the $t$ tests for the individual predictors. If at least one of the predictors is not significant, then eliminate the variable with the largest $p$-value from the model. Repeat Step 2 until all remaining predictors are significant.
  3. Step 3: Verify the assumptions.
  4. Step 4: Report and interpret your final model. Report and interpret the coefficients, the standard error of the estimate $s$, and the adjusted coefficient of determination $R^2_{\text{adj}}$.
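
Step 2 of the strategy lends itself to a short loop. The sketch below is one way to code it, not the text's own procedure: it compares $|t|$ with a two-sided critical value instead of computing $p$-values (within one model all predictors share the same degrees of freedom, so the smallest $|t|$ has the largest $p$-value). The level $\alpha = 0.05$, the small table `T_CRIT_05`, and the function names are assumptions made for illustration. Steps 1, 3, and 4 are not covered.

```python
import numpy as np

# Two-sided critical values t_{0.025, df} from a standard t table.
# ASSUMPTION: alpha = 0.05; only the df values needed for n = 10 are listed.
T_CRIT_05 = {5: 2.571, 6: 2.447, 7: 2.365, 8: 2.306}


def t_statistics(X, y):
    """t = b_i / s_{b_i} for each predictor column of X (intercept added here)."""
    n, k = X.shape
    A = np.column_stack([np.ones(n), X])
    b, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ b
    mse = resid @ resid / (n - k - 1)
    se = np.sqrt(mse * np.diag(np.linalg.inv(A.T @ A)))
    return (b / se)[1:]                      # drop the intercept's t statistic


def backward_eliminate(X, y, names, t_crit=T_CRIT_05):
    """Drop the least significant predictor until every remaining one is significant."""
    names = list(names)
    while names:
        n, k = X.shape
        t = np.abs(t_statistics(X, y))
        worst = int(np.argmin(t))            # smallest |t| = largest p-value
        if t[worst] >= t_crit[n - k - 1]:
            break                            # all remaining predictors are significant
        X = np.delete(X, worst, axis=1)
        del names[worst]
    return names


# Usage on the data set for Exercises 27-33 (columns assumed to be y, x1, x2, x3).
y = np.array([-0.7, 6.4, 2.8, 9.4, 8.6, 13.1, 12.2, 19.1, 18.8, 23.2])
X = np.column_stack([
    np.arange(2, 22, 2, dtype=float),                          # x1
    [0.1, -2.5, 2.7, 2.8, -1.6, 1.0, -1.4, -0.5, 1.0, -2.3],   # x2
    [0, 1, 0, 1, 0, 1, 0, 1, 0, 1],                            # x3 (dummy)
])
kept = backward_eliminate(X, y, ["x1", "x2", "x3"])
print(kept)
```

For this data set the loop removes $x_2$ and keeps $x_1$ and $x_3$, in line with answer 13.3.31.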

Question 13.136

bestdating

34. Best Places for Dating. Sperling's Best Places published the list of best places for dating in America for 2010. Table 6 shows the top 10 places, along with the overall dating score ($y$) and a set of predictor variables.

TABLE 6 Best places for dating in America

| City | $y$ = Overall dating score | Percentage 18–24 years old | Percentage 18–24 years and single | Online dating score |
|---|---|---|---|---|
| Austin | 100.0 | 13.40% | 81.20% | 77.8 |
| Colorado Springs | 88.7 | 10.50% | 74.20% | 88.9 |
| San Diego | 84.0 | 11.30% | 79.40% | 77.4 |
| Raleigh | 80.7 | 11.60% | 82.90% | 79.2 |
| Seattle | 78.7 | 9.00% | 83.90% | 100.0 |
| Charleston | 78.7 | 11.20% | 82.70% | 66.9 |
| Norfolk | 77.0 | 11.20% | 75.60% | 82.9 |
| Ann Arbor | 75.5 | 12.90% | 90.30% | 51.1 |
| Springfield | 75.2 | 11.70% | 89.80% | 63.5 |
| Honolulu | 75.2 | 10.10% | 82.30% | 50.2 |

Question 13.137

bestbusiness

35. Ease of Doing Business. Doing Business (www.doingbusiness.org) publishes statistics on how easy or difficult different countries make it to do business. Table 7 shows the top 12 countries for ease of doing business, with .

TABLE 7 Best countries for ease of doing business

| Country | Easiness score | Starting a business | Employing workers | Paying taxes |
|---|---|---|---|---|
| Singapore | 100 | 10 | 1 | 5 |
| New Zealand | 99 | 1 | 14 | 12 |
| United States | 98 | 6 | 1 | 46 |
| Hong Kong | 97 | 15 | 20 | 3 |
| Denmark | 96 | 16 | 10 | 13 |
| U.K. | 95 | 8 | 28 | 16 |
| Ireland | 94 | 5 | 38 | 6 |
| Canada | 93 | 2 | 18 | 28 |
| Australia | 92 | 3 | 8 | 48 |
| Norway | 91 | 33 | 99 | 18 |
| Iceland | 90 | 17 | 62 | 32 |
| Japan | 89 | 64 | 17 | 112 |

13.3.35

See Solutions Manual.

Question 13.138

vaweather

36. Virginia Weather. Table 8 contains data on weather in a sample of cities in the state of Virginia. We are interested in predicting .

TABLE 8 Data on the weather in Virginia

| City | Heating degree-days | Avg. Jan. temp. | Avg. July temp. | Cooling degree-days |
|---|---|---|---|---|
| Alexandria | 4055 | 34.9 | 79.2 | 1531 |
| Arlington | 4055 | 34.9 | 79.2 | 1531 |
| Blacksburg | 5559 | 30.9 | 71.1 | 533 |
| Charlottesville | 4103 | 35.5 | 76.9 | 1212 |
| Chesapeake | 3368 | 40.1 | 79.1 | 1612 |
| Danville | 3970 | 36.6 | 78.8 | 1418 |
| Hampton | 3535 | 39.4 | 78.5 | 1432 |
| Harrisonburg | 5333 | 30.5 | 73.5 | 758 |
| Leesburg | 5031 | 31.5 | 75.2 | 911 |
| Lynchburg | 4354 | 34.5 | 75.1 | 1075 |
| Manassas | 4925 | 31.7 | 75.7 | 1075 |
| Newport News | 3179 | 41.2 | 80.3 | 1682 |
| Norfolk | 3368 | 40.1 | 79.1 | 1612 |
| Petersburg | 3334 | 39.7 | 79.6 | 1619 |
| Portsmouth | 3368 | 40.1 | 79.1 | 1612 |
| Richmond | 3919 | 36.4 | 77.9 | 1435 |
| Roanoke | 4284 | 35.8 | 76.2 | 1134 |
| Suffolk | 3467 | 39.6 | 78.5 | 1427 |
| Virginia Beach | 3336 | 40.7 | 78.8 | 1482 |

Source: National Oceanic and Atmospheric Administration.

Question 13.139

healthinsurance

37. Health Insurance Coverage. We are interested in estimating , using and . Use the data in Table 9, containing a random sample of U.S. states. All data are in thousands.

TABLE 9 Health insurance coverage

| State | Persons covered | Adults not covered | Children not covered |
|---|---|---|---|
| Alabama | 3,843 | 689 | 82 |
| Arizona | 4,958 | 1,311 | 283 |
| Colorado | 3,977 | 826 | 176 |
| Georgia | 7,688 | 1,659 | 314 |
| Illinois | 10,867 | 1,776 | 302 |
| Kentucky | 3,467 | 639 | 98 |
| Maryland | 4,836 | 776 | 137 |
| Massachusetts | 5,678 | 657 | 103 |
| Michigan | 8,928 | 1,043 | 116 |
| Minnesota | 4,675 | 475 | 104 |
| Missouri | 5,028 | 772 | 127 |
| New Jersey | 7,319 | 1,341 | 277 |
| North Carolina | 7,266 | 1,585 | 307 |
| Ohio | 10,181 | 1,138 | 157 |
| Pennsylvania | 11,108 | 1,237 | 203 |
| South Carolina | 3,553 | 672 | 112 |
| Tennessee | 5,111 | 809 | 94 |
| Virginia | 6,532 | 1,006 | 185 |
| Washington | 5,572 | 746 | 105 |
| Wisconsin | 4,995 | 481 | 63 |

13.3.37

See Solutions Manual.

Question 13.140

accounting

38. Regression in Accounting. We are interested in estimating using , , and . Use the data in Table 10, containing a random sample of large technology companies in 2010. Total assets and total liabilities are in billions of dollars.

TABLE 10 Accounting data for large technology companies

| Company | Current ratio | Price–earnings ratio | Assets | Liabilities |
|---|---|---|---|---|
| Microsoft | 1.82 | 12.51 | 77.9 | 38.3 |
| Intel | 2.79 | 18.44 | 53.1 | 11.4 |
| Dell | 1.28 | 10.95 | 33.7 | 28.0 |
| Apple | 1.88 | 24.57 | 53.9 | 26.0 |
| Google | 10.62 | 18.87 | 40.5 | 4.5 |

Source: Lexis-Nexis.

  (a) Build the final multiple regression model using level of significance .
  (b) Comment on your results from (a).
  (c) Redo your work from (a), this time using level of significance .
  (d) Report and interpret your final model from (c).

Question 13.141

systolic

39. Blood Pressure. Open the data set Systolic. We are interested in estimating , based on the other predictor variables.

13.3.39

See Solutions Manual.

Question 13.142

40. Baseball. In Example 16, interpret the coefficients for Triples, Hits, Home Runs, RBIs, Walks, and Red Sox.

Your Best Model. Work with the Nutrition data sets for Exercises 41 and 42.

Question 13.143

nutrition

41. Use technology to apply the Strategy for Building a Multiple Regression Model, using level of significance , for predicting the number of calories, with the following $x$-variables: protein, fat, saturated fat, cholesterol, carbohydrates, calcium, phosphorus, iron, potassium, sodium, thiamin, niacin, and ascorbic acid.

13.3.41

[Figure: regression output for the final model]

The standard error of the estimate for the final model is $s = 16.7233$. That is, using the multiple regression equation given above, the size of the typical prediction error will be about 16.7233 calories. The adjusted coefficient of determination is $R^2_{\text{adj}} = 0.9991$. In other words, 99.91% of the variation in calories is accounted for by this multiple regression equation.

Question 13.144

nutrition

42. Write a summary to interpret each regression coefficient, and comment on which variables are the most important for predicting the number of calories.