Section 7.2 Exercises

For Exercises 7.40 and 7.41, see page 382; for 7.42, see page 383; for 7.43, see page 384; and for 7.44 and 7.45, see page 392.

In exercises that call for two-sample procedures, you may use either of the two approximations for the degrees of freedom that we have discussed: the value given by your software or the smaller of $n_1 - 1$ and $n_2 - 1$. Be sure to state clearly which approximation you have used.
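As a rough illustration of the two degrees-of-freedom choices, the sketch below computes the Satterthwaite value that software typically reports alongside the conservative value. The function names are hypothetical, and the inputs are the summary values from Exercise 7.65, used only as sample numbers.

```python
def satterthwaite_df(s1, n1, s2, n2):
    """Satterthwaite (software-style) degrees of freedom for the unpooled two-sample t."""
    a = s1**2 / n1  # estimated variance of the first sample mean
    b = s2**2 / n2  # estimated variance of the second sample mean
    return (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))


def conservative_df(n1, n2):
    """Conservative choice: the smaller of n1 - 1 and n2 - 1."""
    return min(n1 - 1, n2 - 1)


# Illustrative inputs only: s1 = 12, n1 = 70, s2 = 10, n2 = 58 (Exercise 7.65).
df_software = satterthwaite_df(12, 70, 10, 58)
df_conservative = conservative_df(70, 58)
print(f"Satterthwaite df = {df_software:.1f}, conservative df = {df_conservative}")
```

The Satterthwaite value always falls between the conservative value and $n_1 + n_2 - 2$, so the conservative choice gives slightly wider intervals and larger $P$-values.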

Question 7.46

7.46 What’s wrong?

In each of the following situations, explain what is wrong and why.

  1. A researcher wants to test versus the two-sided alternative .
  2. A study recorded the credit card IQ scores of 100 college freshmen. The scores of the 48 males in the study were compared with the scores of all 100 freshmen using the two-sample methods of this section.
  3. A two-sample $t$ statistic gave a $P$-value of 0.97. From this, we can reject the null hypothesis with 95% confidence.
  4. A researcher is interested in testing the one-sided alternative . The significance test for gave . With a $P$-value for the two-sided alternative of 0.024, he concluded that his $P$-value was 0.012.

Question 7.47

7.47 Understanding concepts.

For each of the following, answer the question and give a short explanation of your reasoning.

  1. A 95% confidence interval for the difference between two means is reported as (0.3, 0.7). What can you conclude about the results of a level $\alpha = 0.05$ significance test of the null hypothesis that the population means are equal versus the two-sided alternative?
  2. Will larger samples generally give a larger or smaller margin of error for the difference between two sample means?

7.47

(a) Because 0 is not in the interval, we can reject the null hypothesis; the data support a significant difference between the two means. (b) Generally, a larger sample will result in a smaller margin of error.

Question 7.48

7.48 Determining significance.

For each of the following, answer the question and give a short explanation of your reasoning.

  1. A significance test for comparing two means gave with 11 degrees of freedom. Can you reject the null hypothesis that the $\mu$’s are equal versus the two-sided alternative at the 5% significance level?
  2. Answer part (a) for the one-sided alternative that the difference in means is negative.
  3. Answer part (a) for the one-sided alternative that the difference in means is positive.

Question 7.49

7.49 Advertising in sports.

Can there ever be too many commercials during a sporting event? A group of researchers compared the level of acceptance for commercials between NASCAR and NFL fans.24 Each fan was asked a series of 5-point Likert scale questions to evaluate their level of commercial acceptance. The average of these questions was used as the response, where a lower score means less acceptance. Here are the results:

Group     $n$    $\bar{x}$   $s$
NASCAR    300    3.42        0.84
NFL       302    3.27        0.81

  1. Is it appropriate to use the two-sample procedures that we studied in this section to analyze these data for group differences? Give reasons for your answer.
  2. Describe appropriate null and alternative hypotheses for comparing NASCAR and NFL average commercial acceptance levels.
  3. Carry out the significance test using $\alpha = 0.05$. Report the test statistic with the degrees of freedom and the $P$-value. Write a short summary of your conclusion.
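One way to carry out a test like this from summary statistics alone is sketched below. It uses only the Python standard library, so the $P$-value comes from numerically integrating the $t$ density with Simpson's rule rather than from statistical software. The function names are hypothetical, and the result may differ slightly from what a package or a $t$ table reports.

```python
import math


def welch_t(mean1, s1, n1, mean2, s2, n2):
    """Return (t, df) for the unpooled two-sample t statistic."""
    a, b = s1**2 / n1, s2**2 / n2
    t = (mean1 - mean2) / math.sqrt(a + b)
    df = (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))  # Satterthwaite
    return t, df


def t_pdf(x, df):
    """Density of the t distribution, computed with log-gamma to avoid overflow."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=2000):
    """Two-sided P-value: 1 minus twice the area under the density from 0 to |t|."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1.0 - 2.0 * total * h / 3)


# Summary statistics from the table in this exercise (NASCAR first, NFL second).
t_stat, df = welch_t(3.42, 0.84, 300, 3.27, 0.81, 302)
p_value = two_sided_p(t_stat, df)
print(f"t = {t_stat:.2f}, df = {df:.1f}, P = {p_value:.4f}")
```

Using the conservative degrees of freedom instead only requires passing `min(n1 - 1, n2 - 1)` to `two_sided_p`.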

7.49

(a) Yes, because outliers are not possible and , the procedures can be used. (b) (c) . The data are significant at the 5% level, and there is evidence of a difference between NASCAR and NFL average commercial acceptance levels.

Question 7.50

7.50 Advertising in sports, continued.

Refer to the previous exercise. This study not only allows a comparison of these two fan groups, but also an assessment of each fan group separately. Write a short paragraph summarizing the key results an advertiser should take away from this study.

Question 7.51

7.51 Trustworthiness and eye color.

Why do we naturally tend to trust some strangers more than others? One group of researchers decided to study the relationship between eye color and trustworthiness.25 In their experiment, the researchers took photographs of 80 students (20 males with brown eyes, 20 males with blue eyes, 20 females with brown eyes, and 20 females with blue eyes), each seated in front of a white background looking directly at the camera with a neutral expression. These photos were cropped so the eyes were horizontal and at the same height in the photo and so the neckline was visible. They then recruited 105 participants to judge the trustworthiness of each student photo. This was done using a 10-point scale, where 1 meant very untrustworthy and 10 very trustworthy. The 80 scores from each participant were then converted to $z$-scores, and the average $z$-score of each photo (across all 105 participants) was used for the analysis. Here is a summary of the results:

Eye color   $n$   $\bar{x}$   $s$
Brown       40    0.55        1.68
Blue        40    −0.38       1.53

Can we conclude from these data that brown-eyed students appear more trustworthy compared with their blue-eyed counterparts? Test the hypothesis that the average scores for the two groups are the same.

7.51

. (c) . The data show that brown-eyed students appear more trustworthy compared with their blue-eyed counterparts.

Question 7.52

7.52 Sadness and spending.

The “misery is not miserly” phenomenon refers to a sad person’s spending judgment going haywire. In a recent study, 31 young adults were given $10 and randomly assigned to either a sad or a neutral group. The participants in the sad group watched a video about the death of a boy’s mentor (from The Champ), and those in the neutral group watched a video on the Great Barrier Reef. After the video, each participant was offered the chance to trade $0.50 increments of the $10 for an insulated water bottle.26 Here are the data:

sadness

Group     Purchase price ($)
Neutral   0.00  2.00  0.00  1.00  0.50  0.00  0.50
          2.00  1.00  0.00  0.00  0.00  0.00  1.00
Sad       3.00  4.00  0.50  1.00  2.50  2.00  1.50  0.00  1.00
          1.50  1.50  2.50  4.00  3.00  3.50  1.00  3.50

  1. Examine each group’s prices graphically. Is use of the $t$ procedures appropriate for these data? Carefully explain your answer.
  2. Make a table with the sample size, mean, and standard deviation for each of the two groups.
  3. State appropriate null and alternative hypotheses for comparing these two groups.
  4. Perform the significance test at the $\alpha = 0.05$ level, making sure to report the test statistic, degrees of freedom, and $P$-value. What is your conclusion?
  5. Construct a 95% confidence interval for the mean difference in purchase price between the two groups.
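For the summary table requested in this exercise, a minimal sketch with Python's standard `statistics` module is shown here. The data are typed in from the table above, and the helper name `summarize` is hypothetical.

```python
import statistics

# Purchase prices ($) copied from the table in this exercise.
neutral = [0.00, 2.00, 0.00, 1.00, 0.50, 0.00, 0.50,
           2.00, 1.00, 0.00, 0.00, 0.00, 0.00, 1.00]
sad = [3.00, 4.00, 0.50, 1.00, 2.50, 2.00, 1.50, 0.00, 1.00,
       1.50, 1.50, 2.50, 4.00, 3.00, 3.50, 1.00, 3.50]


def summarize(values):
    """Return (n, mean, sample standard deviation) for one group."""
    return len(values), statistics.mean(values), statistics.stdev(values)


for name, values in (("Neutral", neutral), ("Sad", sad)):
    n, mean, sd = summarize(values)
    print(f"{name:8s} n = {n:2d}  mean = {mean:.3f}  s = {sd:.3f}")
```

`statistics.stdev` divides by $n - 1$, which is the sample standard deviation used by the two-sample procedures.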

Question 7.53

7.53 Noise levels in fitness classes.

Fitness classes often have very loud music that could affect hearing. One study collected noise levels (decibels) in both high-intensity and low-intensity fitness classes across eight commercial gyms in Sydney, Australia.27

noise

  1. Create a histogram or Normal quantile plot for the high-intensity classes. Do the same for the low-intensity classes. Are the distributions reasonably Normal? Summarize the distributions in words.
  2. Test the equality of means using a two-sided alternative hypothesis and significance level $\alpha = 0.05$.
  3. Are the $t$ procedures appropriate given your observations in part (a)? Explain your answer.
  4. Remove the one low decibel reading for the low-intensity group and redo the significance test. How does this outlier affect the results?
  5. Do you think the results of the significance test from part (b) or (d) should be reported? Explain your answer.

7.53

(a) Both distributions are Normally distributed, except the low-intensity class has a low outlier.
(b) . The data are significant at the 5% level, and there is evidence the noise levels are different between the high- and low-intensity fitness classes. (c) Because the low-intensity class has an outlier, the $t$ test is not appropriate. (d) . Removing the outlier didn’t change the results. (e) Because the outlier is not affecting the results, it is probably okay to report both tests. It would be a good idea to investigate the outlier and see why it had such a low decibel value; if it were drastically different in some way, it might be good to remove it and only report the test without it after mentioning its removal.

Question 7.54

7.54 Noise levels in fitness classes, continued.

Refer to the previous exercise. In most countries, the workplace noise standard is 85 dB (over eight hours). For every 3 dB increase above that, the amount of exposure time is halved. This means that the exposure time for a dB level of 91 is two hours, and for a dB level of 94 it is one hour.

noise

  1. Construct a 95% confidence interval for the mean dB level in high-intensity classes.
  2. Using the interval in part (a), construct a 95% confidence interval for the number of one-hour classes per day an instructor can teach before possibly risking hearing loss. (Hint: This is a linear transformation.)
  3. Repeat parts (a) and (b) for low-intensity classes.
  4. Explain how one might use these intervals to determine the staff size of a new gym.
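The halving rule described in this exercise can be written as a small function. This is only a sketch: the name `allowed_hours` and its parameters are hypothetical, and applying the rule continuously between 3 dB steps is an assumption rather than something the exercise states.

```python
def allowed_hours(db_level, base_db=85.0, base_hours=8.0, step_db=3.0):
    """Permitted daily exposure time: base_hours at base_db, halved for every step_db above it."""
    return base_hours * 0.5 ** ((db_level - base_db) / step_db)


# Reproduces the two reference points given in the exercise (91 dB and 94 dB).
for level in (85, 88, 91, 94):
    print(f"{level} dB -> {allowed_hours(level):.1f} hours")
```

Because the permitted time decreases as the dB level increases, the lower endpoint of an interval for the mean dB level corresponds to the upper endpoint for exposure time, and vice versa.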

Question 7.55

7.55 Counts of seeds in one-pound scoops.

Refer to Exercise 7.23 (pages 375–376). As part of the Six Sigma quality improvement effort, the company wants to compare scoops of seeds from two different packaging plants. An SRS of 50 one-pound scoops of seeds was collected from Plant 1746, and an SRS of 19 one-pound scoops of seeds was collected from Plant 1748. The number of seeds in each scoop was recorded.

seedcnt2

  1. Using this data set, create a histogram, boxplot, and Normal quantile plot of the seed counts from Plant 1746. Do the same for Plant 1748. Are the distributions reasonably Normal? Summarize the distributions in words.
  2. Are the $t$ procedures appropriate given your observations in part (a)? Explain your answer.
  3. Compare the mean number of seeds per one-pound scoop for these two manufacturing plants using a 99% confidence interval.
  4. Test the equality of the means using a two-sided alternative and a significance level of 1%. Make sure to specify the test statistic, degrees of freedom, and $P$-value.
  5. Write a brief summary of your procedures assuming your audience is the company CEO and the two plant managers.

7.55

(a) For plant 1746: the data are roughly Normal. For plant 1748: the data are somewhat left-skewed but have several clusters or groups of points. (b) Because the total , the procedures are appropriate. (c) For 1746: . For 1748: . Using , the 99% C.I. is (−418.4, −76.8). (d) . The data are significant at the 1% level, and there is evidence that the mean number of seeds per 1-pound scoop is different for the two plants. (e) Answers will vary. The emphasis should be on the difference between the number of seeds so that potentially the scoops from plant 1746 are too light or the scoops from plant 1748 are too heavy (assuming the seeds are the same size/weight).

Question 7.56

7.56 More on counts of seeds.

Refer to the previous exercise.

  1. When would a one-sided alternative hypothesis be appropriate in this setting? Explain.
  2. What alternative hypothesis would we be testing if we halved the $P$-value from the previous exercise?

Question 7.57

7.57 Drive-thru customer service.

QSRMagazine.com assessed 1855 drive-thru visits at quick-service restaurants.28 One benchmark assessed was customer service. Responses ranged from “Rude (1)” to “Very Friendly (5).” The following table breaks down the responses according to two of the chains studied.

drvthru

                    Rating
Chain          1    2    3    4    5
Taco Bell      0    5   41  143  119
McDonald’s     1   22   55  139  100

  1. A researcher decides to compare the average rating of McDonald’s and Taco Bell. Comment on the appropriateness of using the average rating for these data.
  2. Assuming an average of these ratings makes sense, comment on the use of the $t$ procedures for these data.
  3. Report the means and standard deviations of the ratings for each chain separately.
  4. Test whether the two chains, on average, have the same customer satisfaction. Use a two-sided alternative hypothesis and a significance level of 5%.
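When data arrive as counts per rating rather than as individual responses, the mean and standard deviation can be computed directly from the frequency table. The sketch below does this for the table in this exercise; the helper name `summarize` is hypothetical.

```python
import math

ratings = [1, 2, 3, 4, 5]
# Counts per rating, copied from the table in this exercise.
counts = {
    "Taco Bell": [0, 5, 41, 143, 119],
    "McDonald's": [1, 22, 55, 139, 100],
}


def summarize(values, freqs):
    """Return (n, mean, sample standard deviation) from a frequency table."""
    n = sum(freqs)
    mean = sum(v * f for v, f in zip(values, freqs)) / n
    ss = sum(f * (v - mean) ** 2 for v, f in zip(values, freqs))
    return n, mean, math.sqrt(ss / (n - 1))


for chain, freqs in counts.items():
    n, mean, sd = summarize(ratings, freqs)
    print(f"{chain:11s} n = {n}  mean = {mean:.3f}  s = {sd:.3f}")
```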

7.57

(a) The problem with averaging ratings is that there is no guarantee that the steps between ratings are equal, that is, that going from a rating of 1 to 2 is the same as going from 2 to 3, and so on. Taking averages assumes this, so it is likely not appropriate. (b) The data are ratings from 1–5; as such they certainly will not be Normally distributed but because and outliers are not possible, the $t$ procedures can be used. (c) McDonald’s: . Taco Bell: . (d) . The data are significant at the 5% level, and there is evidence the average customer ratings for the two chains are different.

Question 7.58

7.58 Dust exposure at work.

Exposure to dust at work can lead to lung disease later in life. One study measured the workplace exposure of tunnel construction workers.29 Part of the study compared 115 drill and blast workers with 220 outdoor concrete workers. Total dust exposure was measured in milligram years per cubic meter ($\mathrm{mg \cdot y/m^3}$). The mean exposure for the drill and blast workers was with a standard deviation of . For the outdoor concrete workers, the corresponding values were 6.5 and , respectively.

  1. The sample included all workers for a tunnel construction company who received medical examinations as part of routine health checkups. Discuss the extent to which you think these results apply to other similar types of workers.
  2. Use a 95% confidence interval to describe the difference in the exposures. Write a sentence that gives the interval and provides the meaning of 95% confidence.
  3. Test the null hypothesis that the exposures for these two types of workers are the same. Justify your choice of a one-sided or two-sided alternative. Report the test statistic, the degrees of freedom, and the $P$-value. Give a short summary of your conclusion.
  4. The authors of the article describing these results note that the distributions are somewhat skewed. Do you think that this fact makes your analysis invalid? Give reasons for your answer.

Question 7.59

7.59 Not all dust is the same.

Not all dust particles that are in the air around us cause problems for our lungs. Some particles are too large and stick to other areas of our body before they can get to our lungs. Others are so small that we can breathe them in and out and they will not deposit in our lungs. The researchers in the study described in the previous exercise also measured respirable dust. This is dust that deposits in our lungs when we breathe it. For the drill and blast workers, the mean exposure to respirable dust was with a standard deviation of . The corresponding values for the outdoor concrete workers were 1.4 and , respectively. Analyze these data using the questions in the previous exercise as a guide.

7.59

(a) Answers will vary. But there are likely differences about this company’s workers that could not be generalized to other workers. (b) (4.37, 5.43). With 95% confidence, the drill and blast workers have between 4.37 and 5.43 more exposure to respirable dust than the outdoor concrete workers. (c) . There is significant evidence that the drill and blast workers have more exposure to respirable dust than the outdoor concrete workers. (d) Because , the procedures can be used for skewed data.

Question 7.60

7.60 Active companies versus failed companies.

CASE 7.2 Examples 7.14 and 7.15 (pages 390–391) compare active and failed companies under the special assumption that the two populations of firms have the same standard deviation. In practice, we prefer not to make this assumption, so let’s analyze the data without making this assumption. We expect active firms to have higher cash flow margins. Do the data give good evidence in favor of this expectation? By how much on the average does the cash flow margin for active firms exceed that for failed firms (use 99% confidence)?

cmps

Question 7.61

7.61 When is 30/31 days not equal to a month?

Time can be expressed on different levels of scale: days, weeks, months, and years. Can the scale provided influence perception of time? For example, if you placed an order over the phone, would it make a difference if you were told the package would arrive in four weeks or one month? To investigate this, two researchers asked a group of 267 college students to imagine their car needed major repairs and would have to stay at the shop. Depending on the group he or she was randomized to, the student was either told it would take one month or 30/31 days. Each student was then asked to give best- and worst-case estimates of when the car would be ready. The interval between these two estimates (in days) was the response. Here are the results:30

Group        $n$    $\bar{x}$   $s$
30/31 days   177    20.4        14.3
One month    90     24.8        13.9

  1. Given that the interval cannot be less than 0, the distributions are likely skewed. Comment on the appropriateness of using the $t$ procedures.
  2. Test that the average interval is the same for the two groups using the $\alpha = 0.05$ significance level. Report the test statistic, the degrees of freedom, and the $P$-value. Give a short summary of your conclusion.

7.61

(a) Because , we can use the procedures on skewed data. (b) . The data are significant at the 5% level, and there is evidence the means of the two groups are different. Those who are told 30/31 days have a smaller expectation interval on average than those who are told 1 month.

Question 7.62

7.62 When is 52 weeks not equal to a year?

Refer to the previous exercise. The researchers also had 60 marketing students read an announcement about a construction project. The expected duration was either one year or 52 weeks. Each student was then asked to state the earliest and latest completion date.

Group      $n$   $\bar{x}$   $s$
52 weeks   30    84.1        55.8
1 year     30    139.6       73.1

Test that the average interval is the same for the two groups using the $\alpha = 0.05$ significance level. Report the test statistic, the degrees of freedom, and the $P$-value. Give a short summary of your conclusion.

Question 7.63

7.63 Fitness and ego.

Employers sometimes seem to prefer executives who appear physically fit, despite the legal troubles that may result. Employers may also favor certain personality characteristics. Fitness and personality are related. In one study, middle-aged college faculty who had volunteered for a fitness program were divided into low-fitness and high-fitness groups based on a physical examination. The subjects then took the Cattell Sixteen Personality Factor Questionnaire.31 Here are the data for the “ego strength” personality factor:

ego

Low fitness             High fitness
4.99   5.53   3.12      6.68   5.93   5.71
4.24   4.12   3.77      6.42   7.08   6.20
4.74   5.10   5.09      7.32   6.37   6.04
4.93   4.47   5.40      6.38   6.53   6.51
4.16   5.30             6.16   6.68

  1. Is the difference in mean ego strength significant at the 5% level? At the 1% level? Be sure to state $H_0$ and $H_a$.
  2. Can you generalize these results to the population of all middle-aged men? Give reasons for your answer.
  3. Can you conclude that increasing fitness causes an increase in ego strength? Give reasons for your answer.

7.63

(a) . The data are significant at both the 5% and 1% levels, and there is evidence the two groups are different in mean ego strength. (b) No, they were all college faculty who volunteered and would not represent all middle-aged men. (c) No, the study was observational; we would need an experiment to show causation.

Question 7.64

7.64 Study design matters!

In the previous exercise, you analyzed data on the ego strength of high-fitness and low-fitness participants in a campus fitness program. Suppose that instead you had data on the ego strengths of the same men before and after six months in the program. You wonder if the program has affected their ego scores. Explain carefully how the statistical procedures you would use would differ from those you applied in Exercise 7.63.

Question 7.65

7.65 Sales of small appliances.

A market research firm supplies manufacturers with estimates of the retail sales of their products from samples of retail stores. Marketing managers are prone to look at the estimate and ignore sampling error. Suppose that an SRS of 70 stores this month shows mean sales of 53 units of a small appliance, with standard deviation 12 units. During the same month last year, an SRS of 58 stores gave mean sales of 50 units, with standard deviation 10 units. An increase from 50 to 53 is a rise of 6%. The marketing manager is happy, because sales are up 6%.

  1. Use the two-sample procedure to give a 95% confidence interval for the difference in mean number of units sold at all retail stores.
  2. Explain in language that the manager can understand why he cannot be confident that sales rose by 6%, and that in fact sales may even have dropped.
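The interval in part (a) can be sketched from the summary statistics as follows. The critical value is passed in by hand: `t_star = 2.009` is the 95% value for 50 degrees of freedom, which assumes the conservative degrees of freedom (57 here) are rounded down to the nearest row of a $t$ table. Software using a different degrees-of-freedom rule would give a slightly different interval. The function name is hypothetical.

```python
import math


def two_sample_ci(mean1, s1, n1, mean2, s2, n2, t_star):
    """Unpooled confidence interval for mu1 - mu2, given a critical value t_star."""
    diff = mean1 - mean2
    margin = t_star * math.sqrt(s1**2 / n1 + s2**2 / n2)
    return diff - margin, diff + margin


# This year: n = 70, mean 53, s = 12. Last year: n = 58, mean 50, s = 10.
# t_star = 2.009 is an assumed table value (95% confidence, df = 50).
low, high = two_sample_ci(53, 12, 70, 50, 10, 58, t_star=2.009)
print(f"({low:.2f}, {high:.2f})")
```

An interval that contains 0 means the data are compatible with no change, or even a decrease, in mean sales.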

7.65

(a) (−0.91, 6.91). (b) With 95% confidence, the mean change in sales from last year to this year is between −0.91 and 6.91. Because the interval covers 0 and includes some negative values, it is possible sales have actually decreased.

Question 7.66

7.66 Compare two marketing strategies.

A bank compares two proposals to increase the amount that its credit card customers charge on their cards. (The bank earns a percentage of the amount charged, paid by the stores that accept the card.) Proposal A offers to eliminate the annual fee for customers who charge $3600 or more during the year. Proposal B offers a small percent of the total amount charged as a cash rebate at the end of the year. The bank offers each proposal to an SRS of 150 of its existing credit card customers. At the end of the year, the total amount charged by each customer is recorded. Here are the summary statistics:

Group   $n$    $\bar{x}$   $s$
A       150    $3385       $468
B       150    $3124       $411

  1. Do the data show a significant difference between the mean amounts charged by customers offered the two plans? Give the null and alternative hypotheses, and calculate the two-sample $t$ statistic. Obtain the $P$-value (either approximately from Table D or more accurately from software). State your practical conclusions.
  2. The distributions of amounts charged are skewed to the right, but outliers are prevented by the limits that the bank imposes on credit balances. Do you think that skewness threatens the validity of the test that you used in part (a)? Explain your answer.

Question 7.67

7.67 More on smart shopping carts.

Recall Example 7.10 (pages 381–382). The researchers also had participants, who were not told they were on a budget, go through the same online grocery shopping exercise.

smart1

  1. For this set of participants, construct a table that includes the sample size, mean, and standard deviation of the total cost for the subset of participants with feedback and those without.
  2. Generate histograms or Normal quantile plots for each subset. Comment on the distributions and whether it is appropriate to use the $t$ procedures.
  3. Test that the average cost of the cart is the same for these two groups using the 0.05 significance level. Write a short summary of your findings. Make sure to compare them with the results in Example 7.10.

7.67

(a) For those with feedback, . For those without feedback, . (b) Both Normal quantile plots show the two variables are both roughly Normally distributed. (c) . The data are significant at the 5% level, and there is evidence the two groups are different in total cost for those with and without feedback among those who were not told they were on a budget. The results are similar to those in Example 7.10; feedback helped reduce spending.

Question 7.68

7.68 New hybrid tablet and laptop?

The purchasing department has suggested your company switch to a new hybrid tablet and laptop. As CEO, you want data to be assured that employees will like these new hybrids over the old laptops. You designate the next 14 employees needing a new laptop to participate in an experiment in which seven will be randomly assigned to receive the standard laptop and the remainder will receive the new hybrid tablet and laptop. After a month of use, these employees will express their satisfaction with their new computers by responding to the statement “I like my new computer” on a scale from 1 to 5, where 1 represents “strongly disagree,” 2 is “disagree,” 3 is “neutral,” 4 is “agree,” and 5 is “strongly agree.”

  1. The employees with the hybrid computers have an average satisfaction score of 4.2 with standard deviation 0.7. The employees with the standard laptops have an average of 3.4 with standard deviation 1.5. Give a 95% confidence interval for the difference in the mean satisfaction scores for all employees.
  2. Would you reject the null hypothesis that the mean satisfaction for the two types of computers is the same versus the two-sided alternative at significance level 0.05? Use your confidence interval to answer this question. Explain why you do not need to calculate the test statistic.

Question 7.69

7.69 Why randomize?

A coworker suggested that you give the new hybrid computers to the next seven employees who need new computers and the standard laptop to the following seven. Explain why your randomized design is better.

7.69

There could be things that are similar about the next 7 employees who need new computers as well as the following 7, which could bias the results (like being from the same office or department).

Question 7.70

7.70 Pooled procedures.

Refer to the previous two exercises. Reanalyze the data using the pooled procedure. Does the conclusion depend on the choice of method? The standard deviations are quite different for these data, so we do not recommend use of the pooled procedures in this case.

Question 7.71

7.71 Satterthwaite approximation.

The degrees of freedom given by the Satterthwaite approximation are always at least as large as the smaller of $n_1 - 1$ and $n_2 - 1$ and never larger than the sum $n_1 + n_2 - 2$. In Exercise 7.53 (pages 394–395), you were asked to compare the analyses with and without a very low decibel reading in the low-intensity group. Redo those analyses and make a table showing the sample sizes $n_1$ and $n_2$, the standard deviations $s_1$ and $s_2$, and the Satterthwaite degrees of freedom for each of these analyses. Based on these results, suggest when the Satterthwaite degrees of freedom will be closer to the smaller of $n_1 - 1$ and $n_2 - 1$ and when they will be closer to $n_1 + n_2 - 2$.

noise

7.71

When the standard deviations are similar, the Satterthwaite DF are closer to $n_1 + n_2 - 2$. When one standard deviation is much larger, the Satterthwaite DF are closer to the smaller of $n_1 - 1$ and $n_2 - 1$.

Question 7.72

7.72 Pooled equals unpooled?

The software outputs in Figure 7.10 (pages 387–388) give the same value for the pooled and unpooled $t$ statistics. Do some simple algebra to show that this is always true when the two sample sizes $n_1$ and $n_2$ are the same. In other cases, the two $t$ statistics usually differ.
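A numerical check is not a substitute for the algebra, but it can confirm the claim before you prove it. The sketch below computes both statistics for equal and unequal sample sizes. The function names are hypothetical, the equal-size inputs are the summary values from Exercise 7.68, and the unequal-size case simply changes one sample size for illustration.

```python
import math


def unpooled_t(m1, s1, n1, m2, s2, n2):
    """Two-sample t statistic without assuming equal standard deviations."""
    return (m1 - m2) / math.sqrt(s1**2 / n1 + s2**2 / n2)


def pooled_t(m1, s1, n1, m2, s2, n2):
    """Two-sample t statistic using the pooled variance estimate."""
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(sp2 * (1 / n1 + 1 / n2))


equal = (4.2, 0.7, 7, 3.4, 1.5, 7)     # summary values from Exercise 7.68
unequal = (4.2, 0.7, 7, 3.4, 1.5, 12)  # hypothetical: second sample size changed

print(unpooled_t(*equal), pooled_t(*equal))      # agree up to rounding error
print(unpooled_t(*unequal), pooled_t(*unequal))  # differ
```

Even when the two statistics agree, the degrees of freedom used to judge them generally do not.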

Question 7.73

7.73 The advantage of pooling.

For the analysis of wheat prices in Example 7.13 (pages 385–386), there are only five observations per month. When sample sizes are small, we have very little information to make a judgment about whether the population standard deviations are equal. The potential gain from pooling is large when the sample sizes are very small. Assume that we will perform a two-sided test using the 5% significance level.

wheat

  1. Find the critical value for the unpooled test statistic that does not assume equal variances. Use the minimum of $n_1 - 1$ and $n_2 - 1$ for the degrees of freedom.
  2. Find the critical value for the pooled test statistic.
  3. How does comparing these critical values show an advantage of the pooled test?

7.73

(a) $t^* = 2.776$ (df = 4). (b) $t^* = 2.306$ (df = 8). (c) Because the critical value is smaller for the pooled test, it is easier to show significance than the unpooled test.

Question 7.74

7.74 The advantage of pooling.

Suppose that in the setting of the previous exercise, you are interested in 95% confidence intervals for the difference rather than significance testing. Find the widths of the intervals for the two procedures (assuming or not assuming equal standard deviations). How do they compare?

wheat