Chapter 7: Inference for Means

CHAPTER 7 EXERCISES

Question 7.119

7.119 LSAT scores. The scores of four senior roommates on the Law School Admission Test (LSAT) are

153 162 166 133

Find the mean, the standard deviation, and the standard error of the mean. Is it appropriate to calculate a confidence interval based on these data? Explain why or why not.
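The three summaries requested here can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the variable names are illustrative and do not come from the text.

```python
import math
import statistics

scores = [153, 162, 166, 133]  # LSAT scores of the four roommates

n = len(scores)
mean = statistics.mean(scores)
s = statistics.stdev(scores)   # sample standard deviation (divides by n - 1)
sem = s / math.sqrt(n)         # standard error of the mean

print(f"n = {n}, mean = {mean:.2f}, s = {s:.2f}, SE = {sem:.2f}")
```

The arithmetic says nothing about whether four roommates can be treated as a random sample, which is the judgment the second half of the exercise asks for.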

Question 7.120

7.120 Converting a two-sided P-value. You use statistical software to perform a significance test of the null hypothesis that two means are equal. The software reports a P-value for the two-sided alternative. Your alternative is that the first mean is greater than the second mean.

(a) The software reports t = 1.85 with a P-value of 0.075. Would you reject H₀ at α = 0.05? Explain your answer.
(b) The software reports t = −1.85 with a P-value of 0.075. Would you reject H₀ at α = 0.05? Explain your answer.
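The conversion in both parts follows one rule: halve the two-sided P-value when the sign of t agrees with the alternative, and otherwise take one minus that half. A possible sketch follows; the function name `one_sided_p` is hypothetical, and it assumes a test statistic with a symmetric null distribution.

```python
def one_sided_p(t_stat, two_sided_p, alternative="greater"):
    """Convert a two-sided P-value into a one-sided P-value.

    Assumes the null distribution of the statistic is symmetric about 0,
    as it is for t statistics.
    """
    if alternative not in ("greater", "less"):
        raise ValueError("alternative must be 'greater' or 'less'")
    # Does the observed statistic point in the direction of the alternative?
    agrees = t_stat > 0 if alternative == "greater" else t_stat < 0
    return two_sided_p / 2 if agrees else 1 - two_sided_p / 2


print(one_sided_p(1.85, 0.075))
print(one_sided_p(-1.85, 0.075))
```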

Question 7.121

7.121 Degrees of freedom and t*. As the degrees of freedom increase, the t distributions get closer and closer to the z (N(0, 1)) distribution. One way to see this is to look at how the value of t* for a 95% confidence interval changes with the degrees of freedom.

(a) Make a plot with degrees of freedom from 10 to 100 by 10 on the x axis and t* on the y axis. Also draw a horizontal line on the plot corresponding to the value of z* = 1.96.
(b) Summarize the main features of the plot.
(c) Describe how this plot would change if you considered a 90% confidence interval.
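The t* values for the plot can be read from Table D or produced by software. The sketch below is one way to compute them with only the Python standard library: it integrates the t density numerically and finds the critical value by bisection. The helper names (`t_pdf`, `t_cdf`, `t_star`) are illustrative assumptions; a statistics package would normally do this in one call.

```python
import math


def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=2000):
    """P(T <= x), using Simpson's rule on [0, |x|] and symmetry about 0."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def t_star(confidence, df):
    """Critical value t* with central area `confidence`, found by bisection."""
    target = 1 - (1 - confidence) / 2
    lo, hi = 0.0, 50.0  # bracket is wide enough for the df values used here
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


for df in range(10, 101, 10):
    print(df, round(t_star(0.95, df), 3))
```

Calling `t_star(0.90, df)` instead gives the values needed for part (c).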

Question 7.122

7.122 Sample size and margin of error. The margin of error for a confidence interval for μ depends on the confidence level, the sample standard deviation s, and the sample size. Fix the confidence level at 95% and the sample standard deviation at s = 1 to examine the effect of the sample size. Find the margin of error for sample sizes of 11 to 101 by 10s—that is, let n = 11, 21, 31, . . . , 101. Plot the margins of error versus the sample size and summarize the relationship.
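The margin of error here is t* · s/√n with n − 1 degrees of freedom. The following sketch tabulates it for the requested sample sizes. It assumes the usual t-table critical values for 95% confidence (hard-coded and rounded to three decimals); the names `T_STAR_95` and `margin_of_error` are illustrative.

```python
import math

# Standard t-table critical values for 95% confidence, keyed by df = n - 1.
T_STAR_95 = {10: 2.228, 20: 2.086, 30: 2.042, 40: 2.021, 50: 2.009,
             60: 2.000, 70: 1.994, 80: 1.990, 90: 1.987, 100: 1.984}


def margin_of_error(n, s=1.0):
    """Margin of error t* * s / sqrt(n), with t* looked up for df = n - 1."""
    return T_STAR_95[n - 1] * s / math.sqrt(n)


margins = {n: margin_of_error(n) for n in range(11, 102, 10)}
for n, m in margins.items():
    print(f"n = {n:3d}  margin = {m:.4f}")
```

Plotting `margins` against n with any plotting tool shows the shape of the relationship the exercise asks you to summarize.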

Question 7.123

7.123 Which design? The following situations all require inference about a mean or means. Identify each as (1) a single sample, (2) matched pairs, or (3) two independent samples. Explain your answers.

(a) Your customers are college students. You are interested in comparing the interest in a new product that you are developing between those students who live in the dorms and those who live elsewhere.
(b) Your customers are college students. You are interested in finding out which of two new product labels is more appealing.
(c) Your customers are college students. You are interested in assessing their interest in a new product.

Question 7.124

7.124 Which design? The following situations all require inference about a mean or means. Identify each as (1) a single sample, (2) matched pairs, or (3) two independent samples. Explain your answers.

(a) You want to estimate the average age of your store’s customers.
(b) You do an SRS survey of your customers every year. One of the questions on the survey asks about customer satisfaction on a seven-point scale with the response 1 indicating “very dissatisfied” and 7 indicating “very satisfied.” You want to see if the mean customer satisfaction has improved from last year.
(c) You ask an SRS of customers their opinions on each of two new floor plans for your store.

Question 7.125

7.125 Number of critical food violations. The results of a major city’s restaurant inspections are available through its online newspaper.⁴⁵ Critical food violations are those that put patrons at risk of getting sick and must immediately be corrected by the restaurant. An SRS of n = 200 inspections from the more than 16,000 inspections since January 2012 was collected, resulting in x̄ = … violations and s = 1.822 violations.

(a) Test the hypothesis that the average number of critical violations is less than 1.5 using a significance level of 0.05. State the two hypotheses, the test statistic, and P-value.
(b) Construct a 95% confidence interval for the average number of critical violations and summarize your result.
(c) Which of the two summaries (significance test versus confidence interval) do you find more helpful in this case? Explain your answer.
(d) These data are integers ranging from 0 to 10. The data are also skewed to the right, with 79% of the values either a 0 or a 1. Given this information, do you think use of the t procedures is appropriate? Explain your answer.

Question 7.126

7.126 Two-sample t test versus matched pairs t test. Consider the following data set. The data were actually collected in pairs, and each row represents a pair.

Group 1	Group 2
48.86	48.88
50.60	52.63
51.02	52.55
47.99	50.94
54.20	53.02
50.66	50.66
45.91	47.78
48.79	48.44
47.76	48.92
51.13	51.63

(a) Suppose that we ignore the fact that the data were collected in pairs and mistakenly treat this as a two-sample problem. Compute the sample mean and variance for each group. Then compute the two-sample t statistic, degrees of freedom, and P-value for the two-sided alternative.
(b) Now analyze the data in the proper way. Compute the sample mean and variance of the differences. Then compute the t statistic, degrees of freedom, and P-value.
(c) Describe the differences in the two test results.
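The two analyses in parts (a) and (b) differ only in how the standard error is formed. The sketch below computes both t statistics with the standard library; it stops short of P-values, which need a t table or statistical software. Variable names are illustrative.

```python
import math
import statistics

group1 = [48.86, 50.60, 51.02, 47.99, 54.20, 50.66, 45.91, 48.79, 47.76, 51.13]
group2 = [48.88, 52.63, 52.55, 50.94, 53.02, 50.66, 47.78, 48.44, 48.92, 51.63]
n = len(group1)

# (a) Two-sample analysis that (mistakenly) ignores the pairing.
m1, m2 = statistics.mean(group1), statistics.mean(group2)
v1, v2 = statistics.variance(group1), statistics.variance(group2)
se_two = math.sqrt(v1 / n + v2 / n)
t_two = (m1 - m2) / se_two

# (b) Matched pairs analysis on the within-pair differences, df = n - 1.
diffs = [a - b for a, b in zip(group1, group2)]
d_bar = statistics.mean(diffs)
se_paired = statistics.stdev(diffs) / math.sqrt(n)
t_paired = d_bar / se_paired

print(f"two-sample: diff = {m1 - m2:.3f}, SE = {se_two:.3f}, t = {t_two:.2f}")
print(f"paired:     diff = {d_bar:.3f}, SE = {se_paired:.3f}, t = {t_paired:.2f}")
```

Both analyses estimate the same mean difference; only the standard error, and therefore the t statistic, changes.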

Question 7.127

7.127 Two-sample t test versus matched pairs t test, continued. Refer to the previous exercise. Perhaps an easier way to see the major difference in the two analysis approaches for these data is by computing 95% confidence intervals for the mean difference.

(a) Compute the 95% confidence interval using the two-sample t confidence interval.
(b) Compute the 95% confidence interval using the matched pairs t confidence interval.
(c) Compare the estimates (that is, the centers of the intervals) and margins of error. What is the major difference between the two approaches for these data?

Question 7.128

7.128 Average service time. Recall the drive-thru study in Exercise 7.69 (page 457). Another benchmark that was measured was the service time. A summary of the results (in seconds) for two of the chains is shown below.

Chain	n	x̄	s
Taco Bell	308	158.03	33.8
McDonald’s	317	189.49	41.3

(a) Is there a difference in the average service time between these two chains? Test the null hypothesis that the chains’ average service time is the same. Use a significance level of 0.05.
(b) Construct a 95% confidence interval for the difference in average service time.
(c) Lex plans to go to Taco Bell and Sam to McDonald’s. Does the interval in part (b) contain the difference in their service times that they’re likely to encounter? Explain your answer.
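When only summary statistics are available, the two-sample t statistic can be computed directly from n, x̄, and s. The sketch below assumes the middle column of the table is the sample mean in seconds. It uses the Welch–Satterthwaite approximation for the degrees of freedom, which is what most software reports; the conservative choice of the smaller of n₁ − 1 and n₂ − 1 is an alternative. The function name `two_sample_t` is hypothetical.

```python
import math


def two_sample_t(n1, mean1, s1, n2, mean2, s2):
    """Unpooled two-sample t statistic, its standard error, and Welch df."""
    a, b = s1 ** 2 / n1, s2 ** 2 / n2
    se = math.sqrt(a + b)
    t = (mean1 - mean2) / se
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, se, df


# Taco Bell versus McDonald's service times (seconds).
t, se, df = two_sample_t(308, 158.03, 33.8, 317, 189.49, 41.3)
print(f"t = {t:.2f}, SE = {se:.3f}, df = {df:.1f}")
```

The same helper applies to any other exercise in this set that reports n, x̄, and s for two independent groups.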

Question 7.129

7.129 Interracial friendships in college. A study utilized the random roommate assignment process of a small college to investigate the interracial mix of friends among students in college.⁴⁶ As part of this study, the researchers looked at 238 white students who were randomly assigned a roommate in their first year and recorded the proportion of their friends (not including the first-year roommate) who were black. The following table summarizes the results, broken down by roommate race, for the middle of the first and third years of college.

Middle of First Year
Randomly assigned	n	x̄	s
Black roommate	41	0.085	0.134
White roommate	197	0.063	0.112

Middle of Third Year
Randomly assigned	n	x̄	s
Black roommate	41	0.146	0.243
White roommate	197	0.062	0.154

(a) Proportions are not Normally distributed. Explain why it may still be appropriate to use the t procedures for these data.
(b) For each year, state the null and alternative hypotheses for comparing these two groups.
(c) For each year, perform the significance test at the α = 0.05 level, making sure to report the test statistic, degrees of freedom, and P-value.
(d) Write a one-paragraph summary of your conclusions from these two tests.

Question 7.130

7.130 Interracial friendships in college, continued. Refer to the previous exercise. For each year, construct a 95% confidence interval for the difference in means μ₁ − μ₂ and describe how these intervals can be used to test the null hypotheses in part (b) of the previous exercise.

Question 7.131

7.131 Alcohol consumption and body composition. Individuals who consume large amounts of alcohol do not use the calories from this source as efficiently as calories from other sources. One study examined the effects of moderate alcohol consumption on body composition and the intake of other foods. Fourteen subjects participated in a crossover design where they either drank wine for the first six weeks and then abstained for the next six weeks or vice versa.⁴⁷ During the period when they drank wine, the subjects, on average, lost 0.4 kilogram (kg) of body weight; when they did not drink wine, they lost an average of 1.1 kg. The standard deviation of the difference between the weight lost under these two conditions is 8.6 kg. During the wine period, they consumed an average of 2589 calories; with no wine, the mean consumption was 2575. The standard deviation of the difference was 210.

(a) Compute the differences in means and the standard errors for comparing body weight and caloric intake under the two experimental conditions.
(b) A report of the study indicated that there were no significant differences in these two outcome measures. Verify this result for each measure, giving the test statistic, degrees of freedom, and the P-value.
(c) One concern with studies such as this, with a small number of subjects, is that there may not be sufficient power to detect differences that are potentially important. Address this question by computing 95% confidence intervals for the two measures and discuss the information provided by the intervals.
(d) Here are some other characteristics of the study. The study periods lasted for six weeks. All subjects were males between the ages of 21 and 50 years who weighed between 68 and 91 kg. They were all from the same city. During the wine period, subjects were told to consume two 135-milliliter (ml) servings of red wine per day and no other alcohol. The entire six-week supply was given to each subject at the beginning of the period. During the other period, subjects were instructed to refrain from any use of alcohol. All subjects reported that they complied with these instructions except for three subjects, who said that they drank no more than three to four 12-ounce bottles of beer during the no-alcohol period. Discuss how these factors could influence the interpretation of the results.
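Because this is a crossover design, parts (a) and (b) are matched pairs calculations based on the mean and standard deviation of the within-subject differences. A minimal sketch, with differences taken as wine minus no wine (the sign convention is an assumption) and a hypothetical helper `paired_t`:

```python
import math


def paired_t(mean_diff, sd_diff, n):
    """Matched pairs t statistic, its standard error, and degrees of freedom."""
    se = sd_diff / math.sqrt(n)
    return mean_diff / se, se, n - 1


n_subjects = 14

# Weight lost (kg): 0.4 with wine, 1.1 without; SD of the differences 8.6.
t_weight, se_weight, df = paired_t(0.4 - 1.1, 8.6, n_subjects)

# Calories consumed: 2589 with wine, 2575 without; SD of the differences 210.
t_cal, se_cal, _ = paired_t(2589 - 2575, 210, n_subjects)

print(f"weight:   SE = {se_weight:.3f}, t = {t_weight:.3f}, df = {df}")
print(f"calories: SE = {se_cal:.3f}, t = {t_cal:.3f}, df = {df}")
```

The standard errors from this sketch are also the ingredients for the confidence intervals in part (c), once multiplied by t* for 13 degrees of freedom.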

Question 7.132

7.132 The wine makes the meal? In one study, 39 diners were given a free glass of cabernet sauvignon wine to accompany a French meal.⁴⁸ Although the wine was identical, half the bottle labels claimed the wine was from California and the other half claimed it was from North Dakota. The following table summarizes the grams of entrée and wine consumed during the meal.

	Wine label	n	Mean	St. Dev
Entrée	California	24	499.8	87.2
	North Dakota	15	439.0	89.2
Wine	California	24	100.8	23.3
	North Dakota	15	110.4	9.0

Did the patrons who thought that the wine was from California consume more? Analyze the data and write a report summarizing your work. Be sure to include details regarding the statistical methods you used, your assumptions, and your conclusions.

Question 7.133

7.133 Can mockingbirds learn to identify specific humans? A central question in urban ecology is why some animals adapt well to the presence of humans and others do not. The following results summarize part of a study of the northern mockingbird (Mimus polyglottos) that took place on a campus of a large university.⁴⁹ For four consecutive days, the same human approached a nest and stood 1 meter away for 30 seconds, placing his or her hand on the rim of the nest. On the fifth day, a new person did the same thing. Each day, the distance of the human from the nest when the bird flushed was recorded. This was repeated for 24 nests. The human intruder varied his or her appearance (that is, wore different clothes) over the four days. We report results for only Days 1, 4, and 5 here. The response variable is flush distance measured in meters.

Day	Mean	s
1	6.1	4.9
4	15.1	7.3
5	4.9	5.3

(a) Explain why this should be treated as a matched design.
(b) Unfortunately, the research article does not provide the standard error of the difference, only the standard error of the mean flush distance for each day. However, we can use the general addition rule for variances (page 258) to approximate it. If we assume that the correlation between the flush distance at Day 1 and Day 4 for each nest is ρ = 0.40, what is the standard deviation for the difference in distance?
(c) Using your result in part (b), test the hypothesis that there is no difference in the flush distance across these two days. Use a significance level of 0.05.
(d) Repeat parts (b) and (c) but now compare Day 1 and Day 5, assuming a correlation between flush distances for each nest of ρ = 0.30.
(e) Write a brief summary of your conclusions.
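For parts (b) through (d), the addition rule for variances gives the variance of a difference of two correlated quantities as σ₁² + σ₂² − 2ρσ₁σ₂. The sketch below applies it, treating the s column of the table as each day's standard deviation and using the correlations assumed in the exercise; the function name is illustrative.

```python
import math


def sd_of_difference(s1, s2, rho):
    """SD of X - Y when X and Y have SDs s1, s2 and correlation rho."""
    return math.sqrt(s1 ** 2 + s2 ** 2 - 2 * rho * s1 * s2)


n_nests = 24

# Day 4 minus Day 1, assumed correlation 0.40.
sd_1_vs_4 = sd_of_difference(4.9, 7.3, 0.40)
t_1_vs_4 = (15.1 - 6.1) / (sd_1_vs_4 / math.sqrt(n_nests))

# Day 5 minus Day 1, assumed correlation 0.30.
sd_1_vs_5 = sd_of_difference(4.9, 5.3, 0.30)
t_1_vs_5 = (4.9 - 6.1) / (sd_1_vs_5 / math.sqrt(n_nests))

print(f"Day 1 vs 4: sd = {sd_1_vs_4:.3f}, t = {t_1_vs_4:.2f}, df = {n_nests - 1}")
print(f"Day 1 vs 5: sd = {sd_1_vs_5:.3f}, t = {t_1_vs_5:.2f}, df = {n_nests - 1}")
```

The results depend on the assumed correlations, so it is worth rerunning with other values of ρ to see how sensitive the conclusions are.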

Question 7.134

7.134 Sign test for assessment of a foreign-language institute. Use the sign test to assess whether the summer institute of Exercise 7.47 (page 432) improves French listening skills. State the hypotheses, give the P-value using the binomial table (Table C), and report your conclusion.

Question 7.135

7.135 Study design information. Refer to Exercise 7.132. In this study, diners were seated alone or in groups of two, three, four, and, in one case, nine (for a total of n = 16 tables). Also, each table, not each patron, was randomly assigned a particular wine label. Does this information alter how you might do the analysis in the previous problem? Explain your answer.

Question 7.136

7.136 Analysis of tree size using the complete data set. The data used in Exercises 7.33 (page 429), 7.79, and 7.80 (page 459) were obtained by taking SRSs from the 584 longleaf pine trees that were measured in the Wade Tract. The entire data set is given in the WADE data set. Find the 95% confidence interval for the mean DBH using the entire data set, and compare this interval with the one that you calculated in Exercise 7.33. Write a report about these data. Include comments on the effect of the sample size on the margin of error, the distribution of the data, the appropriateness of the Normality-based methods for this problem, and the generalizability of the results to other similar stands of longleaf pine or other kinds of trees in this area of the United States and other areas.

Question 7.137

7.137 Can snobby salespeople boost retail sales? Researchers asked 180 women to read a hypothetical shopping experience where they entered a luxury store (e.g., Louis Vuitton, Gucci, Burberry) and asked a salesperson for directions to the items they sought. For half the women, the salesperson was condescending while doing this. The other half were directed in a neutral manner. After reading the experience, participants were asked various questions, including what price they were willing to pay (in dollars) for a particular product from the brand.⁵⁰ Here is a summary of the results.

Salesperson	n	x̄	s
Condescending	90	4.44	3.98
Neutral	90	3.95	2.88

Were the participants who were treated rudely willing to pay more for the product? Analyze the data, and write a report summarizing your work. Be sure to include details regarding the statistical methods you used, your assumptions, and your conclusions.

Question 7.138

7.138 A comparison of female high school students. A study was performed to determine the prevalence of the female athlete triad (low energy availability, menstrual dysfunction, and low bone mineral density) in high school students.⁵¹ A total of 80 high school athletes and 80 sedentary students were assessed. The following table summarizes several measured characteristics:

	Athletes		Sedentary
Characteristic	x̄	s	x̄	s
Body fat (%)	25.61	5.54	32.51	8.05
Body mass index	21.60	2.46	26.41	2.73
Calcium deficit (mg)	297.13	516.63	580.54	372.77
Glasses of milk/day	2.21	1.46	1.82	1.24

(a) For each of the characteristics, test the hypothesis that the means are the same in the two groups. Use a significance level of 0.05 for each test.
(b) Write a short report summarizing your results.

Question 7.139

7.139 More on snobby salespeople. Refer to Exercise 7.137. Researchers also asked a different 180 women to read the same hypothetical shopping experience, but now they entered a mass-market store (e.g., Gap, American Eagle, H&M). Here are those results (in dollars) for the two conditions:

Salesperson	n	x̄	s
Condescending	90	2.90	3.28
Neutral	90	2.98	3.24

Question 7.140

7.140 Transforming the response. Refer to Exercises 7.137 and 7.139. The researchers state that they took the natural log of the willingness-to-pay variable in order to “normalize the distribution” prior to analysis. Thus, their test results are based on log dollar measurements. For the t procedures used in the previous two exercises, do you feel this transformation is necessary? Explain your answer.

Question 7.141

7.141 Competitive prices? A retailer entered into an exclusive agreement with a supplier who guaranteed to provide all products at competitive prices. The retailer eventually began to purchase supplies from other vendors who offered better prices. The original supplier filed a legal action claiming violation of the agreement. In defense, the retailer had an audit performed on a random sample of invoices. For each audited invoice, all purchases made from other suppliers were examined and the prices were compared with those offered by the original supplier. For each invoice, the percent of purchases for which the alternate supplier offered a lower price than the original supplier was recorded.⁵² Here are the data:

0	100	0	100	33	34	100	48	78	100	77	100	38
68	100	79	100	100	100	100	100	100	89	100	100

Report the average of the percents with a 95% margin of error. Do the sample invoices suggest that the original supplier’s prices are not competitive on the average?
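The average and its margin of error can be computed directly from the 25 audited invoices. This sketch uses the one-sample t interval with the table value t* = 2.064 for 24 degrees of freedom. Whether t procedures are reasonable for data this skewed, with so many values at 100, is a separate question the sketch does not settle.

```python
import math
import statistics

percents = [0, 100, 0, 100, 33, 34, 100, 48, 78, 100, 77, 100, 38,
            68, 100, 79, 100, 100, 100, 100, 100, 100, 89, 100, 100]

n = len(percents)
mean = statistics.mean(percents)
se = statistics.stdev(percents) / math.sqrt(n)

t_star = 2.064  # t-table critical value for 95% confidence, df = 24
margin = t_star * se

print(f"n = {n}, mean = {mean:.2f}, margin of error = {margin:.2f}")
```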

Question 7.142

7.142 Weight-loss programs. In a study of the effectiveness of weight-loss programs, 47 subjects who were at least 20% overweight took part in a group support program for 10 weeks. Private weighings determined each subject’s weight at the beginning of the program and six months after the program’s end. The matched pairs t test was used to assess the significance of the average weight loss. The paper reporting the study said, “The subjects lost a significant amount of weight over time, t(46) = 4.68, p < 0.01.” It is common to report the results of statistical tests in this abbreviated style.⁵³

(a) Why was the matched pairs statistic appropriate?
(b) Explain to someone who knows no statistics but is interested in weight-loss programs what the practical conclusion is.
(c) The paper follows the tradition of reporting significance only at fixed levels such as α = 0.01. In fact, the results are more significant than “p < 0.01” suggests. What can you say about the P-value of the t test?

Question 7.143

7.143 Do women perform better in school? Some research suggests that women perform better than men in school, but men score higher on standardized tests. Table 1.3 (page 26) presents data on a measure of school performance, grade point average (GPA), and a standardized test, IQ, for 78 seventh-grade students. Do these data lend further support to the previously found gender differences? Give graphical displays of the data and describe the distributions. Use significance tests and confidence intervals to examine this question, and prepare a short report summarizing your findings.

Question 7.144

7.144 Self-concept and school performance. Refer to the previous exercise. Although self-concept in this study was measured on a scale with values in the data set ranging from 20 to 80, many prefer to think of this kind of variable as having only two possible values: low self-concept or high self-concept. Find the median of the self-concept scores in Table 1.3, and define those students with scores at or below the median to be low-self-concept students and those with scores above the median to be high-self-concept students. Do high-self-concept students have GPAs that differ from those of low-self-concept students? What about IQ? Prepare a report addressing these questions. Be sure to include graphical and numerical summaries and confidence intervals, and state clearly the details of significance tests.

Question 7.145

7.145 Behavior of pet owners. On the morning of March 5, 1996, a train with 14 tankers of propane derailed near the center of the small Wisconsin town of Weyauwega. Six of the tankers were ruptured and burning when the 1700 residents were ordered to evacuate the town. Researchers study disasters like this so that effective relief efforts can be designed for future disasters. About half the households with pets did not evacuate all their pets. A study conducted after the derailment focused on problems associated with retrieval of the pets after the evacuation and characteristics of the pet owners. One of the scales measured “commitment to adult animals,” and the people who evacuated all or some of their pets were compared with those who did not evacuate any of their pets. Higher scores indicate that the pet owner is more likely to take actions that benefit the pet.⁵⁴ Here are the data summaries:

Group	n	x̄	s
Evacuated all or some pets	116	7.95	3.62
Did not evacuate any pets	125	6.26	3.56

Analyze the data and prepare a short report describing the results.

Question 7.146

7.146 Sample size calculation. Example 7.10 (page 434) tells us that the mean height of 10-year-old girls is N(56.9, 2.8) and for boys it is N(56.0, 3.5). The null hypothesis that the mean heights of 10-year-old boys and girls are equal is clearly false. The difference in mean heights is 56.9 − 56.0 = 0.9 inch. Small differences such as this can require large sample sizes to detect. To simplify our calculations, let’s assume that the standard deviations are the same—say, σ = 3.2—and that we will measure the heights of an equal number of girls and boys. How many would we need to measure to have a 90% chance of detecting the (true) alternative hypothesis?
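One common way to approach this calculation is the Normal approximation for a two-sample comparison with equal group sizes and a common σ: n per group ≈ 2[σ(z_{α/2} + z_β)/δ]². The sketch below assumes a two-sided test at α = 0.05, which the exercise does not state, and it ignores the small extra sample size that exact t-based power calculations would add. The function name `n_per_group` is hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(delta, sigma, alpha=0.05, power=0.90):
    """Approximate sample size per group to detect a mean difference delta.

    Normal approximation, two-sided level-alpha test, equal group sizes,
    common standard deviation sigma.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = z.inv_cdf(power)          # quantile matching the desired power
    return math.ceil(2 * (sigma * (z_alpha + z_power) / delta) ** 2)


print(n_per_group(delta=0.9, sigma=3.2))
```

Changing `alpha` or `power` shows how quickly the required sample size grows as the demands on the test tighten.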

Question 7.147

7.147 Different methods of teaching reading. In the READ data set, the response variable Post3 is to be compared for three methods of teaching reading. The Basal method is the standard, or control, method, and the two new methods are DRTA and Strat. We can use the methods of this chapter to compare Basal with DRTA and Basal with Strat. Note that to make comparisons among three treatments it is more appropriate to use the procedures that we will learn in Chapter 12.

(a) Is the mean reading score with the DRTA method higher than that for the Basal method? Perform an analysis to answer this question, and summarize your results.
(b) Answer part (a) for the Strat method in place of DRTA.

Question 7.148

7.148 Designing a new stress management survey. Refer to Exercise 6.17 (page 358). Suppose you want to draw a new SRS of millennials such that the expected margin of error with 99% confidence is 0.2 points. What sample size do you need?

Question 7.149

7.149 Conditions for inference. Suppose that your state contains 85 school corporations and each corporation reports its expenditures per pupil. Is it proper to apply the one-sample t method to these data to give a 95% confidence interval for the average expenditure per pupil? Explain your answer.
