Chapter 7 Review Exercises

Question 7.96

7.96 LSAT scores.

The scores of four classmates on the Law School Admission Test are

166 129 148 153

Find the mean, the standard deviation, and the standard error of the mean. Is it appropriate to calculate a confidence interval based on these data? Explain why or why not.
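The three summaries asked for here can be checked with a few lines of Python. This is only a sketch using the standard library; the variable names are illustrative, and the code leaves aside the question of whether an interval is appropriate for four classmates.

```python
import math
import statistics

scores = [166, 129, 148, 153]  # the four LSAT scores from the exercise

n = len(scores)
mean = statistics.mean(scores)   # sample mean
sd = statistics.stdev(scores)    # sample standard deviation (divides by n - 1)
se = sd / math.sqrt(n)           # standard error of the mean: s / sqrt(n)

print(f"n = {n}, mean = {mean:.2f}, s = {sd:.2f}, SE = {se:.2f}")
```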

Question 7.97

7.97 t is robust.

A manufacturer of flash drives employs a market research firm to estimate retail sales of its products. Here are last month’s sales of 64GB flash drives from an SRS of 50 stores in the Midwest sales region:

retail

29 31 45 40 32 21 23 28 19 11
35 21 17 23 22 22 33 31 34 15
32 27 33 24 21 28 16 67 21 39
33 56 48 14 40 8 47 21 21 25
53 28 35 16 20 24 45 56 28 23
  1. Make a stemplot of the data to confirm that the distribution is skewed to the right. Even though the data are not Normal, explain why the t procedures can be used to analyze these data.
  2. Let’s verify this robustness. Three bootstrap (pages 372–373) simulations, each with 1000 repetitions, give these 95% confidence intervals for mean sales in the entire region: (26.32, 33.10), (26.14, 33.22), and (26.46, 33.20). Find the 95% t confidence interval for the mean. Is it essentially the same as the bootstrap intervals? Explain your answer.

7.97

(a) Because the sample is large (n = 50), we can use the t procedures even though the data are skewed. (b) (26.06, 33.18). Yes, the interval is essentially the same as the bootstrap intervals.
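One way to see where bootstrap intervals like those quoted in the exercise could come from is a percentile bootstrap of the sample mean. The sketch below is an assumption about the method, since the exercise does not say which bootstrap interval was used. The function name, the seed, and the percentile rule are illustrative choices, and a different seed gives slightly different endpoints.

```python
import random
import statistics

# Last month's sales from the SRS of 50 stores (data from the exercise)
sales = [
    29, 31, 45, 40, 32, 21, 23, 28, 19, 11,
    35, 21, 17, 23, 22, 22, 33, 31, 34, 15,
    32, 27, 33, 24, 21, 28, 16, 67, 21, 39,
    33, 56, 48, 14, 40, 8, 47, 21, 21, 25,
    53, 28, 35, 16, 20, 24, 45, 56, 28, 23,
]

def bootstrap_percentile_ci(data, reps=1000, level=0.95, seed=0):
    """Percentile bootstrap interval for the mean (illustrative sketch)."""
    rng = random.Random(seed)            # fixed seed so the run is repeatable
    n = len(data)
    # Resample n values with replacement, reps times, and record each mean
    means = sorted(
        statistics.fmean(rng.choices(data, k=n)) for _ in range(reps)
    )
    tail = (1 - level) / 2
    lo = means[round(tail * reps)]           # roughly the 2.5th percentile
    hi = means[round((1 - tail) * reps) - 1]  # roughly the 97.5th percentile
    return lo, hi

lo, hi = bootstrap_percentile_ci(sales)
print(f"sample mean = {statistics.fmean(sales):.2f}")
print(f"bootstrap 95% interval = ({lo:.2f}, {hi:.2f})")
```

The endpoints should fall near those of the three intervals in part 2, which is the comparison the exercise asks for.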

Question 7.98

7.98 Number of critical food violations.

The results of a major city’s restaurant inspections are available through its online newspaper.34 Critical food violations are those that put patrons at risk of getting sick and must be immediately corrected by the restaurant. An SRS of inspections from the more than 10,000 inspections since January 2012 had violations and violations.

  1. Test the hypothesis, using , that the average number of critical violations is less than 1.25. State the two hypotheses, the test statistic, and the P-value.
  2. Construct a 95% confidence interval for the average number of critical violations and summarize your result.
  3. Which of the two summaries (significance test versus confidence interval) do you find more helpful in this case? Explain your answer.
  4. These data are integers ranging from 0 to 14. The data are also skewed to the right, with 70% of the values either a 0 or 1. Given this information, do you feel use of the t procedures is appropriate? Explain your answer.

Question 7.99

7.99 Interpreting software output.

You use statistical software to perform a significance test of the null hypothesis that two means are equal. The software reports P-values for the two-sided alternative. Your alternative is that the first mean is less than the second mean.

  1. The software reports with a P-value of 0.07. Would you reject H0 with ? Explain your answer.
  2. The software reports with a P-value of 0.07. Would you reject H0 with ? Explain your answer.

7.99

(a) P = 0.035; we reject H0. (b) P = 0.965; we fail to reject H0.
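The conversion behind this exercise can be sketched as follows. When the t statistic points in the direction of the one-sided alternative, the one-sided P-value is half the two-sided value; otherwise it is one minus that half. The function name and the t values of ±1.8 are hypothetical, chosen only to show the two cases. The t statistics actually reported by the software are not reproduced here.

```python
def one_sided_p(t_stat, p_two_sided, alternative):
    """Convert a two-sided t-test P-value into a one-sided P-value.

    alternative is "less" (first mean < second) or "greater".
    """
    if alternative not in ("less", "greater"):
        raise ValueError("alternative must be 'less' or 'greater'")
    half = p_two_sided / 2
    # Does the sign of t agree with the direction of the alternative?
    agrees = t_stat < 0 if alternative == "less" else t_stat > 0
    return half if agrees else 1 - half

# Hypothetical t statistics, for illustration only
p_negative_t = one_sided_p(-1.8, 0.07, "less")   # t agrees with the alternative
p_positive_t = one_sided_p(1.8, 0.07, "less")    # t points the other way
print(p_negative_t, p_positive_t)
```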

Question 7.100

7.100 The wine makes the meal?

In a recent study, 39 diners were given a free glass of Cabernet Sauvignon to accompany a French meal.35 Although the wine was identical, half the bottle labels claimed the wine was from California, and the other half claimed it was from North Dakota. The following table summarizes the grams of entrée and wine consumed during the meal.

          Wine label      n      x̄       s
Entrée    California     24    499.8    87.2
          North Dakota   15    439.0    89.2
Wine      California     24    100.8    23.3
          North Dakota   15    110.4     9.0

Did the patrons who thought the wine was from California consume more? Analyze the data and write a report summarizing your work. Be sure to include details regarding the statistical methods you used, your assumptions, and your conclusions.
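As a starting point for the analysis, a two-sample t statistic can be computed directly from the summary statistics in the table. The sketch below assumes the unpooled (Welch) form with the Satterthwaite approximation for degrees of freedom; the exercise does not prescribe that choice, and the function name is illustrative. The P-value would still need a t table or software.

```python
import math

def welch_t(n1, mean1, s1, n2, mean2, s2):
    """Unpooled two-sample t statistic and Satterthwaite degrees of freedom."""
    v1 = s1 ** 2 / n1                      # estimated variance of the first mean
    v2 = s2 ** 2 / n2                      # estimated variance of the second mean
    se = math.sqrt(v1 + v2)                # standard error of the difference
    t = (mean1 - mean2) / se
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# California minus North Dakota, using the table values
t_entree, df_entree = welch_t(24, 499.8, 87.2, 15, 439.0, 89.2)
t_wine, df_wine = welch_t(24, 100.8, 23.3, 15, 110.4, 9.0)

print(f"entrée: t = {t_entree:.2f}, df = {df_entree:.1f}")
print(f"wine:   t = {t_wine:.2f}, df = {df_wine:.1f}")
```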

Question 7.101

7.101 Study design information.

In the previous study, diners were seated alone or in groups of two, three, four, and, in one case, nine (for a total of tables). Also, each table, not each patron, was randomly assigned a particular wine label. Does this information alter how you might perform the analysis in the previous exercise? Explain your answer.

7.101

The number of people at a table could certainly affect how much each person consumes. Any analysis should take table size into account; because the analysis in the previous exercise did not, it is likely invalid.

Question 7.102

7.102 Which design?

The following situations all require inference about a mean or means. Identify each as (1) a single sample, (2) matched pairs, or (3) two independent samples. Explain your answers.

  1. Your customers are college students. You are interested in comparing the interest in a new product that you are developing between those students who live in the dorms and those who live elsewhere.
  2. Your customers are college students. You are interested in finding out which of two new product labels is more appealing.
  3. Your customers are college students. You are interested in assessing their interest in a new product.

Question 7.103

7.103 Which design?

The following situations all require inference about a mean or means. Identify each as (1) a single sample, (2) matched pairs, or (3) two independent samples. Explain your answers.

  1. You want to estimate the average age of your store’s customers.
  2. You do an SRS survey of your customers every year. One of the questions on the survey asks about customer satisfaction on a 7-point scale with the response 1 indicating “very dissatisfied” and 7 indicating “very satisfied.” You want to see if the mean customer satisfaction has improved from last year.
  3. You ask an SRS of customers their opinions on each of two new floor plans for your store.

7.103

(a) A single sample. (b) Two independent samples. (c) Matched pairs.

Question 7.104

7.104 Two-sample test versus matched pairs test.

Consider the following data set. The data were actually collected in pairs, and each row represents a pair.

paired

Group 1 Group 2
48.86 48.88
50.60 52.63
51.02 52.55
47.99 50.94
54.20 53.02
50.66 50.66
45.91 47.78
48.79 48.44
47.76 48.92
51.13 51.63
  1. Suppose that we ignore the fact that the data were collected in pairs and mistakenly treat this as a two-sample problem. Compute the sample mean and variance for each group. Then compute the two-sample t statistic, degrees of freedom, and P-value for the two-sided alternative.
  2. Now analyze the data in the proper way. Compute the sample mean and variance of the differences. Then compute the t statistic, degrees of freedom, and P-value.
  3. Describe the differences in the two test results.
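The contrast between the two analyses can be sketched numerically. The code below computes both t statistics from the ten pairs; it stops short of P-values, which need a t table or software, and the variable names are illustrative. The two approaches share the same mean difference but use very different standard errors.

```python
import math
import statistics

group1 = [48.86, 50.60, 51.02, 47.99, 54.20, 50.66, 45.91, 48.79, 47.76, 51.13]
group2 = [48.88, 52.63, 52.55, 50.94, 53.02, 50.66, 47.78, 48.44, 48.92, 51.63]
n = len(group1)

# (Mistaken) two-sample analysis: treats the groups as independent
mean_diff = statistics.fmean(group1) - statistics.fmean(group2)
se_two = math.sqrt(
    statistics.variance(group1) / n + statistics.variance(group2) / n
)
t_two = mean_diff / se_two

# Matched pairs analysis: one-sample t on the within-pair differences
diffs = [a - b for a, b in zip(group1, group2)]
se_paired = statistics.stdev(diffs) / math.sqrt(n)
t_paired = statistics.fmean(diffs) / se_paired   # df = n - 1 = 9

print(f"mean difference = {mean_diff:.3f}")
print(f"two-sample:    SE = {se_two:.3f}, t = {t_two:.2f}")
print(f"matched pairs: SE = {se_paired:.3f}, t = {t_paired:.2f}")
```

Pairing removes the pair-to-pair variation from the standard error, which is why the matched pairs statistic is larger in magnitude.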

Question 7.105

7.105 Two-sample test versus matched pairs test, continued.

paired

Refer to the previous exercise. Perhaps an easier way to see the major difference in the two analysis approaches for these data is by computing 95% confidence intervals for the mean difference.

  1. Compute the 95% confidence interval using the two-sample confidence interval.
  2. Compute the 95% confidence interval using the matched pairs confidence interval.
  3. Compare the estimates (that is, the centers of the intervals) and margins of error. What is the major difference between the two approaches for these data?

7.105

(a) (−2.859, 1.153). (b) (−1.761, 0.055). (c) The estimates (centers) are the same, but the margin of error for the two-sample procedure is much larger than for the matched pairs procedure.

Question 7.106

7.106 Average service time.

Recall the drive-thru study in Exercise 7.57 (page 395). Another benchmark that was measured was the service time. A summary of the results (in seconds) for two of the chains is shown here.

Chain          n      x̄        s
Taco Bell     308   158.03   35.7
McDonald’s    317   189.49   42.8
  1. Is there a difference in the average service time between these two chains? Test the null hypothesis that the chains’ average service time is the same. Use a significance level of 0.05.
  2. Construct a 95% confidence interval for the difference in average service time.
  3. Lex plans to go to Taco Bell and Sam to McDonald’s. Is it true that there is a 95% chance that the interval in part (b) contains the difference in their service times? Explain your answer.
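With samples this large, the t critical value is very close to the standard Normal one, so a quick check of part 2 can use a Normal approximation. The sketch below makes that assumption explicit (it is not the exact t interval), and the function name is illustrative. The interval is for the difference in mean service times, not for the difference between two individual visits.

```python
import math
from statistics import NormalDist

def two_sample_ci_large(n1, mean1, s1, n2, mean2, s2, level=0.95):
    """Large-sample interval for mean1 - mean2 using z* in place of t*."""
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)      # SE of the difference
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)  # about 1.96 for 95%
    diff = mean1 - mean2
    return diff - z_star * se, diff + z_star * se

# Taco Bell minus McDonald's, service time in seconds
lo, hi = two_sample_ci_large(308, 158.03, 35.7, 317, 189.49, 42.8)
print(f"approximate 95% interval: ({lo:.2f}, {hi:.2f})")
```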

Question 7.107

7.107 Average number of cars in the drive-thru lane.

Refer to the previous exercise. A related benchmark measure was the number of cars observed in the drive-thru lane. A summary for the same two chains is shown here.

Chain          n     x̄      s
Taco Bell     308   2.11   2.83
McDonald’s    317   3.81   4.56
  1. Is there a difference in the average number of cars in the drive-thru lane? Test the null hypothesis that the chains’ average number of cars is the same. Use a significance level of 0.05.
  2. These data can only take the values 0, 1, 2, . . . , so they are definitely not Normal. The standard deviations are also much larger than the means, suggesting strong skewness. Does this imply the analysis in part (a) is not reasonable? Explain your answer.

7.107

(a) . The result is significant at the 5% level, and there is evidence that the average number of cars in the drive-thru lane differs between the two chains. (b) Because both samples are large, the t procedures can be used for skewed data, as long as there are no outliers.

Question 7.108

7.108 Does dress affect competence and intelligence ratings?

Researchers performed a study to examine whether or not women are perceived as less competent and less intelligent when they dress in a sexy manner versus a business-like manner. Competence was rated from 1 (not at all) to 7 (extremely), and a 1 to 5 scale was used for intelligence. Under each condition, 17 subjects provided data. Here are summary statistics:36

                    Sexy         Business-like
Rating            x̄      s        x̄      s
Competence       4.13   0.99     5.42   0.85
Intelligence     2.91   0.74     3.50   0.71

Analyze the two variables, and write a report summarizing your work. Be sure to include details regarding the statistical methods you used, your assumptions, and your conclusions.

Question 7.109

7.109 Can snobby salespeople boost retail sales?

Researchers asked 180 women to read a hypothetical shopping experience in which they entered a luxury store (for example, Louis Vuitton, Gucci, Burberry) and asked a salesperson for directions to the items they sought. For half the women, the salesperson was condescending while doing this. The other half were directed in a neutral manner. After reading the experience, participants were asked various questions, including what price they were willing to pay (in dollars) for a particular product from the brand.37 Here is a summary of the results.

Condition        n     x̄      s
Condescending   90   4.44   3.98
Neutral         90   3.95   2.88

Were the participants who were treated rudely willing to pay more for the product? Analyze the data, and write a report summarizing your work. Be sure to include details regarding the statistical methods you used, your assumptions, and your conclusions.

7.109

. There is no significant evidence that those treated rudely were willing to pay more. This is a two-sample one-sided significance test; we assumed the data were randomly selected and that the data contain no outliers.

Question 7.110

7.110 Evaluate the dress study.

Refer to Exercise 7.108. Participants in the study viewed a videotape of a woman described as a 28-year-old senior manager for a Chicago advertising firm who had been working for this firm for seven years. The same woman was used for each of the two conditions, but she wore different clothing each time. For the business-like condition, the woman wore little makeup, black slacks, a turtleneck, a business jacket, and flat shoes. For the sexy condition, the same woman wore a tight knee-length skirt, a low-cut shirt with a cardigan over it, high-heeled shoes, and more makeup, and her hair was tousled. The subjects who evaluated the videotape were male and female undergraduate students who were predominantly Caucasian, from middle- to upper-class backgrounds, and between the ages of 18 and 24. The content of the videotape was identical in both conditions. The woman described her general background, life in college, and hobbies.

  1. Write a critique of this study, with particular emphasis on its limitations and how you would take these into account when drawing conclusions based on the study.
  2. Propose an alternative study that would address a similar question. Be sure to provide details about how your study would be run.

Question 7.111

7.111 More on snobby salespeople.

Refer to Exercise 7.109. Researchers also asked a different 180 women to read the same hypothetical shopping experience, but this time they entered a mass-market store (e.g., Gap, American Eagle, H&M). Here are those results (in dollars) for the two conditions.

Condition        n     x̄      s
Condescending   90   2.90   3.28
Neutral         90   2.98   3.24

Were the participants who were treated rudely willing to pay more for the product? Analyze the data, and write a report summarizing your work. Be sure to include details regarding the statistical methods you used, your assumptions, and your conclusions. Also compare these results with the ones from Exercise 7.109.

7.111

. There is no significant evidence that those treated rudely were willing to pay more. This is a two-sample one-sided significance test; we assumed the data were randomly selected and that the data contain no outliers. We found similar results here as in Exercise 7.109, where condescending did not boost sales.

Question 7.112

7.112 Transforming the response.

Refer to Exercises 7.109 and 7.111. The researchers state that they took the natural log of the willingness to pay variable in order to “normalize the distribution” prior to analysis. Thus, their test results are based on log dollar measurements. For the t procedures used in the previous two exercises, do you feel this transformation is necessary? Explain your answer.

Question 7.113

7.113 Personalities of hotel managers.

Successful hotel managers must have personality characteristics often thought of as feminine (such as “compassionate”) as well as those often thought of as masculine (such as “forceful”). The Bem Sex-Role Inventory (BSRI) is a personality test that gives separate ratings for female and male stereotypes, both on a scale of 1 to 7. Here are summary statistics for a sample of 148 male general managers of three-star and four-star hotels.38 The data come from a comprehensive mailing to these hotels. The response rate was 48%, which is good for mail surveys of this kind. Although nonresponse remains an issue, users of statistics usually act as if they have an SRS when the response rate is “good enough.”

Masculinity score Femininity score

The mean BSRI masculinity score for the general male population is . Is there evidence that hotel managers on the average score higher in masculinity than the general male population?

7.113

. There is evidence that the hotel managers on average score higher in masculinity than the general male population.

Question 7.114

7.114 Another personality trait of hotel managers.

Continue your study from the previous exercise. The mean BSRI femininity score in the general male population is . (It does seem odd that the mean femininity score is higher than the mean masculinity score, but such is the world of personality tests. The two scales are separate.) Is there evidence that hotel managers on the average score higher in femininity than the general male population?

Question 7.115

7.115 Alcohol content of wine.

The alcohol content of wine depends on the grape variety, the way in which the wine is produced from the grapes, the weather, and other influences. Here are data on the percent of alcohol in wine produced from the same grape variety in the same year by 48 winemakers in the same region of Italy:39

wine

12.86 12.88 12.81 12.70 12.51
12.60 12.25 12.53 13.49 12.84
12.93 13.36 13.52 13.62 12.25
13.16 13.88 12.87 13.32 13.08
13.50 12.79 13.11 13.23 12.58
13.17 13.84 12.45 14.34 13.48
12.36 13.69 12.85 12.96 13.78
13.73 13.45 12.82 13.58 13.40
12.20 12.77 14.16 13.71 13.40
13.27 13.17 14.13
  1. Make a stemplot of the data. The distribution is a bit irregular, but there is no reason to avoid use of t procedures for n = 48.
  2. Give a 95% confidence interval for the mean alcohol content of wine of this type.

7.115

(a) There are no outliers, so t procedures can be used for n = 48. (b) (13.00, 13.31).

Question 7.116

7.116 Gender-based expectations?

A summary of U.S. hurricanes over the last six decades shows that feminine-named hurricanes have resulted in significantly more deaths than masculine-named hurricanes.40 Why is this? One group of researchers proposes that this is due to gender-based expectations of severity, which in turn lead to unpreparedness and lack of protective action. To demonstrate this, the researchers used five male and five female hurricane names and asked 346 participants to predict each hurricane’s intensity and strength on a 7-point scale. The data file NAMES contains the average rankings of severity for 50 participants. Is there evidence that there is a gender-based difference in severity? Write a report summarizing your work.

names

Question 7.117

7.117 The manufacture of dyed clothing fabrics.

Different fabrics respond differently when dyed. This matters to clothing manufacturers, who want the color of the fabric to be just right. Fabrics made of cotton and of ramie are dyed with the same “procion blue” dye applied in the same way. A colorimeter is used to measure the lightness of the color on a scale in which black is 0 and white is 100. Here are the data for eight pieces of each fabric:41

dyeclr

Cotton 48.82 48.88 48.98 49.04
48.68 49.34 48.75 49.12
Ramie 41.72 41.83 42.05 41.44
41.27 42.27 41.12 41.49

Which fabric is darker when dyed in this way? Write an answer to this question that includes summary statistics and a test of significance.

7.117

For Cotton: . For Ramie: There is significant evidence that lightness of color is different for the different fabrics.

Question 7.118

7.118 Durable press and breaking strength.

“Durable press” cotton fabrics are treated to improve their recovery from wrinkles after washing. Unfortunately, the treatment also reduces the strength of the fabric. A study compared the breaking strength of fabric treated by two commercial durable press processes. Five specimens of the same fabric were assigned at random to each process. Here are the data, in pounds of pull needed to tear the fabric:42

brkstr

Permafresh 55    29.9   30.7   30.0   29.5   27.6
Hylite LF        28.8   23.9   27.0   22.1   24.2

Is there good evidence that the two processes result in different mean breaking strengths?

Question 7.119

7.119 Find a confidence interval.

Continue your work from the previous exercise. A fabric manufacturer wants to know how large a strength advantage fabrics treated by the Permafresh method have over fabrics treated by the Hylite process. Give a 95% confidence interval for the difference in mean breaking strengths.

brkstr

7.119

(0.723, 7.957).
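The stated interval appears consistent with the conservative two-sample procedure, which uses the smaller of n1 − 1 and n2 − 1 as the degrees of freedom (here 4). That reading is an inference from the numbers rather than something the answer states. The sketch below hard-codes t* = 2.776, the standard table value for 95% confidence with 4 degrees of freedom.

```python
import math
import statistics

permafresh = [29.9, 30.7, 30.0, 29.5, 27.6]   # breaking strength, pounds
hylite = [28.8, 23.9, 27.0, 22.1, 24.2]

T_STAR_DF4 = 2.776   # t* for 95% confidence, df = 4 (table value)

diff = statistics.fmean(permafresh) - statistics.fmean(hylite)
se = math.sqrt(
    statistics.variance(permafresh) / len(permafresh)
    + statistics.variance(hylite) / len(hylite)
)
margin = T_STAR_DF4 * se
lo, hi = diff - margin, diff + margin

print(f"difference = {diff:.2f}, SE = {se:.3f}")
print(f"95% interval: ({lo:.3f}, {hi:.3f})")
```

Software that uses the Satterthwaite degrees of freedom would report a somewhat narrower interval.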

Question 7.120

7.120 Recovery from wrinkles.

Of course, the reason for durable press treatment is to reduce wrinkling. “Wrinkle recovery angle” measures how well a fabric recovers from wrinkles. Higher is better. Here are data on the wrinkle recovery angle (in degrees) for the same fabric specimens discussed in the previous two exercises:

wrinkle

Permafresh 55    136   135   132   137   134
Hylite LF        143   141   146   141   145

Which process has better wrinkle resistance? Is the difference statistically significant?

Question 7.121

7.121 Competitive prices?

A retailer entered into an exclusive agreement with a supplier who guaranteed to provide all products at competitive prices. The retailer eventually began to purchase supplies from other vendors who offered better prices. The original supplier filed a legal action claiming violation of the agreement. In defense, the retailer had an audit performed on a random sample of invoices. For each audited invoice, all purchases made from other suppliers were examined, and the prices were compared with those offered by the original supplier. For each invoice, the percent of purchases for which the alternate supplier offered a lower price than the original supplier was recorded. Here are the data:43

cmppric

100 0 0 100 33 45 100 34 78
100 77 33 100 69 100 89 100 100
100 100 100 100 100 100 100

Report the average of the percents with a 95% margin of error. Do the sample invoices suggest that the original supplier’s prices are not competitive on the average?

7.121

(64.55, 92.09). We are 95% confident that the average percent of purchases for which the alternate supplier offered a lower price is between 64.55% and 92.09%. So, a large percent of the time, the alternate vendor gave better prices, meaning the original supplier’s prices are not competitive.
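The interval above can be reproduced with a one-sample t interval on the 25 invoice percents. The sketch below hard-codes t* = 2.064, the standard table value for 95% confidence with 24 degrees of freedom; the variable names are illustrative. The data are far from Normal (15 of the 25 values are 100), so the result should be read with some caution.

```python
import math
import statistics

# Percent of purchases with a lower alternate-supplier price, per invoice
percents = [
    100, 0, 0, 100, 33, 45, 100, 34, 78,
    100, 77, 33, 100, 69, 100, 89, 100, 100,
    100, 100, 100, 100, 100, 100, 100,
]

T_STAR_DF24 = 2.064   # t* for 95% confidence, df = 24 (table value)

n = len(percents)
mean = statistics.fmean(percents)
se = statistics.stdev(percents) / math.sqrt(n)   # standard error of the mean
margin = T_STAR_DF24 * se
lo, hi = mean - margin, mean + margin

print(f"mean = {mean:.2f}, margin of error = {margin:.2f}")
print(f"95% interval: ({lo:.2f}, {hi:.2f})")
```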

Question 7.122

7.122 Brain training.

The assessment of computerized brain-training programs is a rapidly growing area of research. Researchers are now focusing on who this training benefits most, what brain functions are most susceptible to improvement, and which products are most effective. A recent study looked at 487 community-dwelling adults aged 65 and older, each randomly assigned to one of two training groups. In one group, the participants used a computerized program one hour per day. In the other, DVD-based educational programs were shown and quizzes were administered after each video. The training period lasted eight weeks. The response was the improvement in a composite score obtained from an auditory memory/attention survey given before and after the eight weeks.44 The results are summarized here.

Group               n     x̄      s
Computer program   242   3.9   8.28
DVD program        245   1.8   8.33
  1. Given that other studies show a benefit of computerized brain training, state the null and alternative hypotheses.
  2. Report the test statistic, its degrees of freedom, and the P-value. What is your conclusion using significance level ?
  3. Can you conclude that this computerized brain training always improves a person’s auditory memory/perception better than the DVD program? If not, explain why.

Question 7.123

7.123 Sign test for time using smartphone.

CASE 7.1 Example 7.1 (page 361) gives data on the daily number of minutes eight students at your institution use their smartphones. Is there evidence that the median number of minutes is less than 120 (2 hours)? State the hypotheses, carry out the sign test, and report your conclusion.

smrtphn

7.123

. The data are not significant at the 5% level, and there is not enough evidence that the median is less than 120.
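The mechanics of a one-sided sign test can be sketched as below. The data from Example 7.1 are not reproduced in this section, so the eight values in the code are hypothetical and serve only to exercise the function. The function name is also illustrative. Observations equal to the hypothesized median are dropped, and the count on the side favored by the alternative is compared with a Binomial(n, 1/2) distribution.

```python
from math import comb

def sign_test_p(values, median0, alternative):
    """One-sided sign test P-value for H0: median = median0.

    alternative is "less" (median < median0) or "greater".
    """
    if alternative not in ("less", "greater"):
        raise ValueError("alternative must be 'less' or 'greater'")
    below = sum(v < median0 for v in values)
    above = sum(v > median0 for v in values)
    n = below + above                 # ties with median0 are dropped
    # Evidence for "less" is a large count below median0, and vice versa
    k = below if alternative == "less" else above
    # P(X >= k) for X ~ Binomial(n, 1/2)
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n

# Hypothetical minutes of smartphone use, NOT the Example 7.1 data
hypothetical_minutes = [95, 110, 130, 80, 120, 105, 140, 100]
p_value = sign_test_p(hypothetical_minutes, 120, "less")
print(f"P-value = {p_value:.4f}")
```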

Question 7.124

7.124 Investigating the endowment effect, continued.

Refer to Exercise 7.26 (page 376). The group of researchers also asked these same 40 students from a graduate marketing course to consider a Vosges Oaxaca gourmet chocolate bar made with dark chocolate and chili pepper. Test the null hypothesis that there is no difference between the two prices. Also construct a 95% confidence interval of the endowment effect.

endow1

Question 7.125

7.125 Testing job applicants.

The one-hole test is used to test the manipulative skill of job applicants. This test requires subjects to grasp a pin, move it to a hole, insert it, and return for another pin. The score on the test is the number of pins inserted in a fixed time interval. One study compared male college students with experienced female industrial workers. Here are the data for the first minute of the test:45

Group        n      x̄       s
Students    750   35.12   4.31
Workers     412   37.32   3.83
  1. We expect that the experienced workers will outperform the students, at least during the first minute, before learning occurs. State the hypotheses for a statistical test of this expectation and perform the test. Give a P-value, and state your conclusions.
  2. The distribution of scores is slightly skewed to the left. Explain why the procedure you used in part (a) is nonetheless acceptable.
  3. One purpose of the study was to develop performance norms for job applicants. Based on the preceding data, what is the range that covers the middle 95% of experienced workers? (Be careful! This is not the same as a 95% confidence interval for the mean score of experienced workers.)
  4. The five-number summary of the distribution of scores among the workers is

    23 33.5 37 40.5 46

    for the first minute and

    32 39 44 49 59

    for the fifteenth minute of the test. Display these summaries graphically, and describe briefly the differences between the distributions of scores in the first and fifteenth minutes.

7.125

(a) . There is significant evidence that the experienced workers outperform the students during the first minute. (b) Because both samples are large, the t procedures can be used for skewed data. (c) 29.66 and 44.98. (d) The scores for the 1st minute are much lower than the scores for the 15th minute.
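The warning in part 3 can be made concrete by computing both quantities side by side. The sketch below assumes the worker scores are roughly Normal, so that the 68–95–99.7 rule (mean ± 2s) covers about the middle 95% of individuals. For the confidence interval it uses a Normal critical value in place of t*, a harmless approximation with n = 412. Both choices are assumptions made for illustration.

```python
import math
from statistics import NormalDist

n, mean, s = 412, 37.32, 3.83   # experienced workers, first minute

# Range covering about the middle 95% of individual workers (mean +/- 2s)
range_lo, range_hi = mean - 2 * s, mean + 2 * s

# Approximate 95% confidence interval for the MEAN score of workers
z_star = NormalDist().inv_cdf(0.975)        # about 1.96
margin = z_star * s / math.sqrt(n)
ci_lo, ci_hi = mean - margin, mean + margin

print(f"middle 95% of workers: {range_lo:.2f} to {range_hi:.2f}")
print(f"95% CI for the mean:   {ci_lo:.2f} to {ci_hi:.2f}")
```

The first range describes individual scores and does not shrink as n grows; the second describes uncertainty about the mean and narrows with larger samples.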

Question 7.126

7.126 Ego strengths of MBA graduates: power.

In Exercise 7.93 (page 410), you found the power for a study designed to compare the “ego strengths” of two groups of MBA students. Now you must design a study to compare MBA graduates who reached partner in a large consulting firm with those who joined the firm but failed to become partners.

Assume the same value of and use . You are planning to have 20 subjects in each group. Calculate the power of the pooled two-sample test that compares the mean ego strengths of these two groups of MBA graduates for several values of the true difference. Include values that have a very small chance of being detected and some that are virtually certain to be seen in your sample. Plot the power versus the true difference and write a short summary of what you have found.

Question 7.127

7.127 Sign test for the endowment effects.

Refer to Exercise 7.26 (page 376) and Exercise 7.124. We can also compare the endowment effects of each chocolate bar. Is there evidence that the median difference in endowment effects (Woolloomooloo minus Oaxaca) is greater than 0? Perform a sign test using the 0.05 significance level.

endow2

7.127

. The probability is 0.095. There is no strong evidence that the two endowment effects are different.