For Exercises 11.1 and 11.2, see page 533; for 11.3 and 11.4, see page 535; for 11.5 and 11.6, see page 537; for 11.7 and 11.8, see page 538; for 11.9 and 11.10, see page 541; for 11.11 to 11.14, see pages 543–544; and for 11.15 and 11.16, see page 544.
11.17 Describing a multiple regression.
As part of a study, data from 282 students majoring in accounting at the College of Business Studies in Kuwait were obtained through a survey.5 The researchers were interested in finding determinants of academic performance measured by the student’s major grade point average (MGPA). They considered gender, high school major, age, frequency of doing homework, participation in class, and number of days studying before an exam.
11.17
(a) Major grade point average. (b) . (c) . (d) Gender, high school major, age, frequency of doing homework, participation in class, and number of days studying before the exam.
11.18 Understanding the fitted regression line.
The fitted regression equation for a multiple regression is
11.19 Predicting the price of tablets: Individual variables.
Suppose your company needs to buy some tablets. To help in the purchasing decision, you decide to develop a model to predict the selling price. You decide to obtain price and product characteristic information on 20 tablets from Consumer Reports.6 The characteristics are screen size, battery life, weight (pounds), ease of use, display, and versatility. The latter three are scored on a 1 to 5 scale.
tablts
11.19
(a)
Variable | Mean | Median | Std Dev |
---|---|---|---|
Price | 395.50 | 400.00 | 119.76 |
Size | 9.15 | 9.90 | 1.21 |
Battery | 11.11 | 10.55 | 2.57 |
Weight | 1.07 | 1.10 | 0.29 |
Ease | 4.55 | 5.00 | 0.51 |
Display | 4.30 | 4.00 | 0.47 |
Versatility | 3.80 | 4.00 | 0.41 |
(c) Price is roughly Normal. Size has a bimodal distribution. Battery is right-skewed. Ease, Display, and Versatility each take only two distinct values, even though they were rated on a 1 to 5 scale. No observations look unusual enough to affect the regression analysis. (d) No. Regression makes no assumption about the distributions of the explanatory variables, so this is fine.
11.20 Predicting the price of tablets: Pairs of variables.
Refer to the tablet data described in Exercise 11.19.
tablts
11.21 Predicting the price of tablets: Multiple regression equation.
Refer to the tablet data described in Exercise 11.19.
tablts
11.21
(a) . (b) . (c) A Normal quantile plot shows a potential outlier. (d) . A Normal quantile plot shows that the residuals are much closer to a Normal distribution without the outlier, although the tails still appear slightly heavy. This model is likely much better than the original model: before, only Size was significant; now Battery and Display are significant at the 5% level, and the standard error is much smaller for the second model.
11.22 Predicting the price of a tablet.
Refer to the previous exercise. Let’s use the model with Observation 11 removed.
tablts
11.23 Data analysis: Individual variables.
Table 11.3 gives data on the current fast-food market share, along with the number of franchises, number of company-owned stores, annual sales ($ million) from three years ago, and whether it is a burger restaurant.7 Market share is expressed in percents, based on current U.S. sales.
ffood
11.23
(a)
Variable | Mean | Std Dev | Minimum | Lower Quartile | Median | Upper Quartile | Maximum |
---|---|---|---|---|---|---|---|
Share | 4.94 | 5.07 | 1.72 | 2.33 | 3.28 | 5.48 | 22.69 |
Franchises | 5525.56 | 5754.47 | 0.00 | 1983.00 | 4406.50 | 6563.00 | 23850.00 |
Company | 1116.31 | 1565.77 | 0.00 | 452.50 | 826.50 | 1194.50 | 6707.00 |
Sales | 6.99 | 7.23 | 1.80 | 3.20 | 5.05 | 7.95 | 32.40 |
Burger | 0.31 | 0.48 | 0.00 | 0.00 | 0.00 | 1.00 | 1.00 |
(c) McDonald’s is an outlier for Share and Sales, Subway is an outlier for Franchises, and Starbucks is an outlier for Company. These outliers compress the remaining restaurants into a few histogram bins, so the shapes of the other distributions are hard to judge. Burger takes only two possible values.
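The summary statistics in part (a) can be checked directly against the values listed in Table 11.3. The following is a minimal sketch using only the Python standard library; the names `ROWS`, `COLUMNS`, and `summarize` are illustrative and not part of the exercise. Quartiles are left out because software packages use different quartile conventions, so they may not match the table exactly.

```python
import statistics

# Values copied from Table 11.3:
# (restaurant, market share %, franchises, company stores, sales, burger)
ROWS = [
    ("McDonald's", 22.69, 12477, 1550, 32.4, 1),
    ("Subway", 7.71, 23850, 0, 10.6, 0),
    ("Starbucks", 6.76, 4424, 6707, 7.6, 0),
    ("Wendy's", 5.48, 5182, 1394, 8.3, 1),
    ("Burger King", 5.48, 6380, 873, 8.6, 1),
    ("Taco Bell", 4.78, 4389, 1245, 6.9, 0),
    ("Dunkin' Donuts", 4.02, 6746, 26, 6.0, 0),
    ("Pizza Hut", 3.63, 7083, 459, 5.4, 0),
    ("Chick-fil-A", 2.93, 1461, 76, 3.6, 0),
    ("KFC", 2.87, 4275, 780, 4.7, 0),
    ("Panera Bread", 2.49, 791, 662, 3.1, 0),
    ("Sonic", 2.42, 3117, 455, 3.6, 1),
    ("Domino's", 2.23, 4479, 450, 3.3, 0),
    ("Jack in the Box", 1.98, 1250, 956, 2.9, 1),
    ("Arby's", 1.91, 2505, 1144, 3.0, 0),
    ("Chipotle", 1.72, 0, 1084, 1.8, 0),
]
COLUMNS = ["Share", "Franchises", "Company", "Sales", "Burger"]
STATS = ("mean", "sd", "min", "median", "max")


def summarize(rows):
    """Return {variable name: {statistic: value}} for the numeric columns."""
    summary = {}
    for j, name in enumerate(COLUMNS, start=1):
        values = [row[j] for row in rows]
        summary[name] = {
            "mean": statistics.mean(values),
            "sd": statistics.stdev(values),  # sample SD, divisor n - 1
            "min": min(values),
            "median": statistics.median(values),
            "max": max(values),
        }
    return summary


summary = summarize(ROWS)
print(f"{'Variable':<11}" + "".join(f"{k:>12}" for k in STATS))
for name, stats in summary.items():
    print(f"{name:<11}" + "".join(f"{stats[k]:>12.2f}" for k in STATS))
```

Passing a filtered list, for example `summarize([r for r in ROWS if r[0] != "Subway"])`, shows how much a single restaurant moves the mean and standard deviation.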
11.24 Data analysis: Pairs of variables.
Refer to the previous exercise.
ffood
11.25 Multiple regression equation.
Refer to the fast-food data in Exercise 11.23. Run a multiple regression to predict market share using all four explanatory variables.
ffood
11.25
(a) . (b)
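One way to run this regression is ordinary least squares with NumPy, using the Table 11.3 values. This is a sketch under the assumption that the model has an intercept and all four explanatory variables enter linearly; the variable names (`data`, `coef`, `s`, `r_squared`) are illustrative, and no output values are quoted here.

```python
import numpy as np

# Table 11.3 columns: share (%), franchises, company stores, sales, burger
data = np.array([
    [22.69, 12477, 1550, 32.4, 1],   # McDonald's
    [7.71, 23850, 0, 10.6, 0],       # Subway
    [6.76, 4424, 6707, 7.6, 0],      # Starbucks
    [5.48, 5182, 1394, 8.3, 1],      # Wendy's
    [5.48, 6380, 873, 8.6, 1],       # Burger King
    [4.78, 4389, 1245, 6.9, 0],      # Taco Bell
    [4.02, 6746, 26, 6.0, 0],        # Dunkin' Donuts
    [3.63, 7083, 459, 5.4, 0],       # Pizza Hut
    [2.93, 1461, 76, 3.6, 0],        # Chick-fil-A
    [2.87, 4275, 780, 4.7, 0],       # KFC
    [2.49, 791, 662, 3.1, 0],        # Panera Bread
    [2.42, 3117, 455, 3.6, 1],       # Sonic
    [2.23, 4479, 450, 3.3, 0],       # Domino's
    [1.98, 1250, 956, 2.9, 1],       # Jack in the Box
    [1.91, 2505, 1144, 3.0, 0],      # Arby's
    [1.72, 0, 1084, 1.8, 0],         # Chipotle
])

y = data[:, 0]                                        # response: market share
X = np.column_stack([np.ones(len(y)), data[:, 1:]])   # intercept + 4 predictors

# Least-squares estimates of the regression coefficients
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ coef
residuals = y - fitted

n, k = X.shape                       # k = p + 1 estimated coefficients
sse = float(residuals @ residuals)   # error sum of squares
sst = float(((y - y.mean()) ** 2).sum())
s = (sse / (n - k)) ** 0.5           # regression standard error
r_squared = 1 - sse / sst

labels = ["Intercept", "Franchises", "Company", "Sales", "Burger"]
for label, b in zip(labels, coef):
    print(f"{label:<11}{b: .6f}")
print(f"s = {s:.4f}, R^2 = {r_squared:.4f}")
```

The `residuals` array is the quantity Exercise 11.26 asks for; a statistics package would additionally report standard errors and t tests for each coefficient.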
11.26 Residuals.
Refer to the fast-food data in Exercise 11.23. Find the residuals for the multiple regression used to predict market share based on the four explanatory variables.
ffood
Your analyses in Exercises 11.23 through 11.26 point to two restaurants, McDonald’s and Starbucks, as unusual in several respects. How influential are these restaurants? The following four exercises provide answers.
Restaurant | Market share | Franchises | Company | Sales | Burger |
---|---|---|---|---|---|
McDonald’s | 22.69 | 12,477 | 1550 | 32.4 | 1 |
Subway | 7.71 | 23,850 | 0 | 10.6 | 0 |
Starbucks | 6.76 | 4424 | 6707 | 7.6 | 0 |
Wendy’s | 5.48 | 5182 | 1394 | 8.3 | 1 |
Burger King | 5.48 | 6380 | 873 | 8.6 | 1 |
Taco Bell | 4.78 | 4389 | 1245 | 6.9 | 0 |
Dunkin’ Donuts | 4.02 | 6746 | 26 | 6.0 | 0 |
Pizza Hut | 3.63 | 7083 | 459 | 5.4 | 0 |
Chick-fil-A | 2.93 | 1461 | 76 | 3.6 | 0 |
KFC | 2.87 | 4275 | 780 | 4.7 | 0 |
Panera Bread | 2.49 | 791 | 662 | 3.1 | 0 |
Sonic | 2.42 | 3117 | 455 | 3.6 | 1 |
Domino’s | 2.23 | 4479 | 450 | 3.3 | 0 |
Jack in the Box | 1.98 | 1250 | 956 | 2.9 | 1 |
Arby’s | 1.91 | 2505 | 1144 | 3.0 | 0 |
Chipotle | 1.72 | 0 | 1084 | 1.8 | 0 |
11.27
Rerun Exercise 11.23 without the data for McDonald’s and Starbucks. Compare your results with what you obtained in that exercise.
ffood
11.27
(a)
Variable | Mean | Std Dev | Minimum | Lower Quartile | Median | Upper Quartile | Maximum |
---|---|---|---|---|---|---|---|
Share | 3.55 | 1.75 | 1.72 | 2.23 | 2.90 | 4.78 | 7.71 |
Franchises | 5107.71 | 5848.92 | 0.00 | 1461.00 | 4332.00 | 6380.00 | 23850.00 |
Company | 686.00 | 458.93 | 0.00 | 450.00 | 721.00 | 1084.00 | 1394.00 |
Sales | 5.13 | 2.62 | 1.80 | 3.10 | 4.15 | 6.90 | 10.60 |
Burger | 0.29 | 0.47 | 0.00 | 0.00 | 0.00 | 1.00 | 1.00 |
Removing the two outliers resolved most of the outlier problems seen earlier in the histograms. Subway still shows up as an outlier in the Franchises histogram; otherwise, the distributions of the other variables are now much easier to see. (c) Share is somewhat right-skewed and has an outlier, Subway. Subway is also an extreme outlier for Franchises, which makes the distribution of Franchises hard to judge. Company is roughly uniformly distributed. Sales looks roughly Normal with a slight right skew. Burger takes only two possible values.
11.28
Rerun Exercise 11.24 without the data for McDonald’s and Starbucks. Compare your results with what you obtained in that exercise.
ffood
11.29
Rerun Exercise 11.25 without the data for McDonald’s and Starbucks. Compare your results with what you obtained in that exercise.
ffood
11.29
Removing the two outliers changed the model somewhat and reduced the overall error. (a) . (b) .
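The comparison between the full and reduced fits can be sketched as follows, again with NumPy least squares on the Table 11.3 values. The helper `fit` and the names `keep`, `s_all`, and `s_sub` are hypothetical; the regression standard error is used here as one reasonable measure of overall error, which is an assumption about what the comparison intends.

```python
import numpy as np

names = [
    "McDonald's", "Subway", "Starbucks", "Wendy's", "Burger King",
    "Taco Bell", "Dunkin' Donuts", "Pizza Hut", "Chick-fil-A", "KFC",
    "Panera Bread", "Sonic", "Domino's", "Jack in the Box", "Arby's",
    "Chipotle",
]
# Table 11.3 columns: share (%), franchises, company stores, sales, burger
data = np.array([
    [22.69, 12477, 1550, 32.4, 1], [7.71, 23850, 0, 10.6, 0],
    [6.76, 4424, 6707, 7.6, 0], [5.48, 5182, 1394, 8.3, 1],
    [5.48, 6380, 873, 8.6, 1], [4.78, 4389, 1245, 6.9, 0],
    [4.02, 6746, 26, 6.0, 0], [3.63, 7083, 459, 5.4, 0],
    [2.93, 1461, 76, 3.6, 0], [2.87, 4275, 780, 4.7, 0],
    [2.49, 791, 662, 3.1, 0], [2.42, 3117, 455, 3.6, 1],
    [2.23, 4479, 450, 3.3, 0], [1.98, 1250, 956, 2.9, 1],
    [1.91, 2505, 1144, 3.0, 0], [1.72, 0, 1084, 1.8, 0],
])


def fit(rows):
    """Least-squares fit of share on the four predictors.

    Returns the coefficient vector (intercept first) and the
    regression standard error s.
    """
    y = rows[:, 0]
    X = np.column_stack([np.ones(len(y)), rows[:, 1:]])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    s = float(np.sqrt(resid @ resid / (len(y) - X.shape[1])))
    return coef, s


# Boolean mask that drops the two restaurants flagged as unusual
dropped = {"McDonald's", "Starbucks"}
keep = np.array([name not in dropped for name in names])

coef_all, s_all = fit(data)
coef_sub, s_sub = fit(data[keep])

labels = ["Intercept", "Franchises", "Company", "Sales", "Burger"]
print(f"{'Term':<11}{'All 16':>14}{'Without 2':>14}")
for label, b_all, b_sub in zip(labels, coef_all, coef_sub):
    print(f"{label:<11}{b_all:>14.6f}{b_sub:>14.6f}")
print(f"{'s':<11}{s_all:>14.4f}{s_sub:>14.4f}")
```

Printing the two columns side by side makes it easy to see which coefficients shift most when the two restaurants are removed.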
11.30
Rerun Exercise 11.26 without the data for McDonald’s and Starbucks. Compare your results with what you obtained in that exercise.
ffood
11.31 Predicting retail sales.
Daily sales at a secondhand shop are recorded over a 25-day period.8 The daily gross sales and total number of items sold are broken down into items paid by check, cash, and credit card. The owners expect that the daily numbers of cash items, check items, and credit card items sold will accurately predict gross sales.
retail
11.31
(a) All four variables are somewhat right-skewed. There is a potential outlier for gross sales. (b) All three explanatory variables appear linearly related to gross sales, but each scatterplot has a few semi-outlying observations that could be influential. The correlation matrix shows that cash items and check items both have fairly strong linear relationships with gross sales, but they are also correlated with each other. (c) . (d) The Normal quantile plot shows a roughly Normal distribution with no outliers. The three residual plots all look random, apart from the couple of semi-outlying observations identified earlier. (e) The intercept is not significantly different from 0; .
11.32 Architectural firm billings.
A summary of firms engaged in commercial architecture in the Indianapolis, Indiana, area provides firm characteristics, including total annual billing in the current year, total annual billing in the previous year, the number of architects, the number of engineers, and the number of staff employed in the firm.9 Consider developing a model to predict current total billing using the other four variables.
arch