Chapter 6: Introduction to Inference

SECTION 6.1 EXERCISES

For Exercises 6.1, 6.2, and 6.3, see pages 345–346; for Exercises 6.4 and 6.5, see pages 347–348; for Exercises 6.6, 6.7, and 6.8, see page 351; for Exercises 6.9 and 6.10, see page 354; and for Exercise 6.11, see page 356.

Question 6.12

6.12 Margin of error and the confidence interval. A study of stress on the campus of your university reported a mean stress level of 78 (on a 0 to 100 scale with a higher score indicating more stress) with a margin of error of 5 for 95% confidence. The study was based on a random sample of 64 undergraduates.

(a) Give the 95% confidence interval.
(b) If you wanted 99% confidence for the same study, would your margin of error be greater than, equal to, or less than 5? Explain your answer.

Question 6.13

6.13 Changing the sample size. Consider the setting of the previous exercise. Suppose that the sample mean is again 78 and the population standard deviation is 20. Make a diagram similar to Figure 6.5 (page 351) that illustrates the effect of sample size on the width of a 95% interval. Use the following sample sizes: 9, 25, 81, and 100. Summarize what the diagram shows.

6.13 The margins of error are 13.067, 7.84, 4.356, and 3.92. Interval width decreases as sample size increases.
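
The margins above can be checked with a short script. This is only a sketch: it assumes the table value z* = 1.96 for 95% confidence, and the function name `margin_of_error` is illustrative rather than something from the text.

```python
from math import sqrt

Z_95 = 1.96    # assumed table value of z* for 95% confidence
SIGMA = 20.0   # population standard deviation given in Exercise 6.13
X_BAR = 78.0   # sample mean given in Exercise 6.13

def margin_of_error(z_star, sigma, n):
    """Margin of error z* * sigma / sqrt(n) for a z confidence interval."""
    return z_star * sigma / sqrt(n)

# Margin of error for each sample size listed in the exercise
margins = {n: margin_of_error(Z_95, SIGMA, n) for n in (9, 25, 81, 100)}

for n, m in margins.items():
    print(f"n = {n:3d}: m = {m:.3f}, interval = ({X_BAR - m:.3f}, {X_BAR + m:.3f})")
```

Because the margin is proportional to 1/sqrt(n), the printed intervals narrow as n grows, which is the pattern the diagram is meant to show.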

Question 6.14

6.14 Changing the confidence level. Consider the setting of the previous two exercises. Suppose that the sample mean is still 78, the sample size is 64, and the population standard deviation is 20. Make a diagram similar to Figure 6.6 (page 353) that illustrates the effect of the confidence level on the width of the interval. Use 80%, 90%, 95%, and 99%. Summarize what the diagram shows.

Question 6.15

6.15 Confidence interval mistakes and misunderstandings. Suppose that 500 randomly selected alumni of the University of Okoboji were asked to rate the university’s academic advising services on a 1 to 10 scale. The sample mean $\bar{x}$ was found to be 8.6. Assume that the population standard deviation is known to be σ = 2.2.

(a) Ima Bitlost computes the 95% confidence interval for the average satisfaction score as 8.6 ± 1.96(2.2). What is her mistake?
(b) After correcting her mistake in part (a), she states, “I am 95% confident that the sample mean falls between 8.4 and 8.8.” What is wrong with this statement?
(c) She quickly realizes her mistake in part (b) and instead states, “The probability that the true mean is between 8.4 and 8.8 is 0.95.” What misinterpretation is she making now?
(d) Finally, in her defense for using the Normal distribution to determine the confidence interval she says, “Because the sample size is quite large, the population of alumni ratings will be approximately Normal.” Explain to Ima her misunderstanding and correct this statement.

6.15 (a) She did not divide the standard deviation by $\sqrt{500} \approx 22.361$. (b) Confidence intervals concern the population mean, not the sample mean. (c) 95% is a confidence level, not a probability: the true mean is fixed, so it either is or is not in the interval. (d) The large sample size does not affect the distribution of individual alumni ratings; it makes the sampling distribution of $\bar{x}$ approximately Normal.

Question 6.16

6.16 More confidence interval mistakes and misunderstandings. Suppose that 100 randomly selected members of the Karaoke Channel were asked how much time they typically spend on the site during the week.⁶ The sample mean $\bar{x}$ was found to be 3.8 hours. Assume that the population standard deviation is known to be σ = 2.9.

(a) Cary Oakey computes the 95% confidence interval for the average time on the site as 3.8 ± 1.96(2.9/100). What is his mistake?
(b) He corrects this mistake and then states that “95% of the members spend between 3.23 and 4.37 hours a week on the site.” What is wrong with his interpretation of this interval?
(c) The margin of error is slightly larger than half an hour. To reduce this to roughly 15 minutes, Cary says that the sample size needs to be doubled to 200. What is wrong with this statement?

Question 6.17

6.17 The state of stress in the United States. Since 2007, the American Psychological Association has supported an annual nationwide survey to examine stress across the United States.⁷ This year, a total of 720 millennials (18- to 33-year-olds) were asked to indicate their average stress level (on a 10-point scale) during the past month. The mean score was 5.5. Assume that the population standard deviation is 2.8.

(a) Give the margin of error and find the 95% confidence interval for this sample.
(b) Repeat these calculations for a 99% confidence interval. How do the results compare with those in part (a)?

6.17 (a) m = 0.2045; (5.295, 5.705). (b) m = 0.2688; (5.231, 5.769). The 99% interval is wider.
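
One way to reproduce both intervals is sketched below. It computes z* from the standard Normal distribution with Python's `statistics.NormalDist` instead of reading it from a table, so the last digit can differ slightly from a hand calculation with 1.96 or 2.576. The helper name `z_interval` is an assumption made for illustration.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(x_bar, sigma, n, level):
    """Return (margin, lower, upper) for a z confidence interval with known sigma."""
    # z* leaves (1 - level) / 2 of the probability in each tail of the standard Normal
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z_star * sigma / sqrt(n)
    return margin, x_bar - margin, x_bar + margin

# Values from Exercise 6.17: mean 5.5, sigma 2.8, n = 720
m95, lo95, hi95 = z_interval(5.5, 2.8, 720, 0.95)
m99, lo99, hi99 = z_interval(5.5, 2.8, 720, 0.99)

print(f"95%: m = {m95:.4f}, interval = ({lo95:.3f}, {hi95:.3f})")
print(f"99%: m = {m99:.4f}, interval = ({lo99:.3f}, {hi99:.3f})")
```

Raising the confidence level increases z* and leaves everything else unchanged, so the 99% interval comes out wider than the 95% interval.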

Question 6.18

6.18 Inference based on integer values. Refer to Exercise 6.17. The data for this study are integer values between 1 and 10. Explain why the confidence interval based on the Normal distribution should be a good approximation.

Question 6.19

6.19 Mean TRAP in young women. For many important processes that occur in the body, direct measurement of characteristics of the process is not possible. In many cases, however, we can measure a biomarker, a biochemical substance that is relatively easy to measure and is associated with the process of interest. Bone turnover is the net effect of two processes: the breaking down of old bone, called resorption, and the building of new bone, called formation. One biochemical measure of bone resorption is tartrate-resistant acid phosphatase (TRAP), which can be measured in blood. In a study of bone turnover in young women, serum TRAP was measured in 31 subjects.⁸ The mean was 13.2 units per liter (U/l). Assume that the standard deviation is known to be 6.5 U/l. Give the margin of error and find a 95% confidence interval for the mean TRAP amount in young women represented by this sample.

6.19 The margin of error is 2.29 U/l and the 95% confidence interval for $\mu$ is 10.91 to 15.49 U/l.

Question 6.20

6.20 Mean OC in young women. Refer to the previous exercise. A biomarker for bone formation measured in the same study was osteocalcin (OC), measured in the blood. For the 31 subjects in the study, the mean was 33.4 nanograms per milliliter (ng/ml). Assume that the standard deviation is known to be 19.6 ng/ml. Report the 95% confidence interval.

Question 6.21

6.21 Populations sampled and margins of error. Consider the following two scenarios. (A) Take a simple random sample of 200 freshman students at your college or university. (B) Take a simple random sample of 200 students at your college or university. For each of these samples, you will record the amount spent on textbooks used for classes during the fall semester. Which sample should have the smaller margin of error? Explain your answer.

6.21 Scenario A has a smaller margin of error. The value of $\sigma$ would likely be smaller for A because we might expect less variability in textbook cost for freshman students than for all students.

Question 6.22

6.22 Average starting salary. The National Association of Colleges and Employers (NACE) Spring Salary Survey shows that the current class of college graduates received an average starting-salary offer of $48,127.⁹ Your institution collected an SRS (n = 300) of its recent graduates and obtained a 95% confidence interval of ($46,382, $48,008). What can we conclude about the difference between the average starting salary of recent graduates at your institution and the overall NACE average? Write a short summary.

Question 6.23

6.23 Consumption of sweet snacks. A recent study reported that the U.S. per capita consumption of sweet snacks among healthy weight children aged 12 to 19 years is 251.2 kilocalories per day (kcal/d).¹⁰ This was based on 24-hour dietary recall records of n = 2265 adolescents.

(a) Suppose that the population distribution is heavily skewed, with a standard deviation equal to 540 kcal/d. What is the margin of error for a 95% confidence interval of the per capita consumption of sweet snacks?
(b) A future study is being planned and the goal is to have the margin of error no more than 15 kcal/d. Based on your answer to part (a), will this study require an examination of more or fewer recall records? Explain your answer without calculations.
(c) Compute the sample size necessary for the study described in part (b).

6.23 (a) m = 22.24. (b) To yield a margin of error of 15, we would need a larger sample than 2265. (c) n = 4979.
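
The sample-size calculation in part (c) can be sketched as follows, assuming the table value z* = 1.96. The function name `required_n` is hypothetical; the rule it applies is n = (z* σ / m)², rounded up to the next whole number.

```python
from math import ceil, sqrt

Z_95 = 1.96     # assumed table value of z* for 95% confidence
SIGMA = 540.0   # population standard deviation in kcal/d, from the exercise

def required_n(z_star, sigma, margin):
    """Smallest n whose margin of error z* * sigma / sqrt(n) is at most `margin`."""
    return ceil((z_star * sigma / margin) ** 2)

current_margin = Z_95 * SIGMA / sqrt(2265)   # part (a): margin with n = 2265
n_needed = required_n(Z_95, SIGMA, 15)       # part (c): target margin of 15 kcal/d

print(f"margin with n = 2265: {current_margin:.2f}")
print(f"n needed for margin 15: {n_needed}")
```

Rounding up rather than to the nearest integer is deliberate: rounding down would leave the margin of error slightly above the target.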

Question 6.24

6.24 Total sleep time of college students. In Example 5.4 (page 293), the total sleep time per night among college students was approximately Normally distributed with mean μ = 6.78 hours and standard deviation σ = 1.24 hours. You initially plan to take an SRS of size n = 175 and compute the average total sleep time.

(a) What is the standard deviation for the average time in hours? in minutes?
(b) Use the 95 part of the 68–95–99.7 rule to describe the variability of this sample mean.
(c) What is the probability that your average will be below 6.9 hours?

Question 6.25

6.25 Determining sample size. Refer to the previous exercise. You really want to use a sample size such that about 95% of the averages fall within ±5 minutes of the true mean μ = 6.78.

(a) Based on your answer to part (b) in Exercise 6.24, should the sample size be larger or smaller than 175? Explain.
(b) What standard deviation of $\bar{x}$ do you need such that 95% of all samples will have a mean within 5 minutes of μ?
(c) Using the standard deviation you calculated in part (b), determine the number of students you need to sample.

6.25 (a) Larger. (b) We would need the standard deviation to be 0.04167 hours. (c) n = 886.
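
The arithmetic behind parts (b) and (c) can be sketched as below. It follows the 68–95–99.7 rule, so it treats "about 95%" as two standard deviations rather than 1.96; the variable names are illustrative only.

```python
from math import ceil

SIGMA_HOURS = 1.24    # population standard deviation from Exercise 6.24
TARGET_MINUTES = 5    # desired half-width for about 95% of sample means

# By the 95 part of the 68-95-99.7 rule, two standard deviations of the
# sample mean should equal the target, expressed here in hours.
sd_needed = (TARGET_MINUTES / 60) / 2

# The standard deviation of the sample mean is sigma / sqrt(n); solve for n and round up.
n_needed = ceil((SIGMA_HOURS / sd_needed) ** 2)

print(f"required standard deviation of the mean: {sd_needed:.5f} hours")
print(f"required sample size: {n_needed}")
```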

Question 6.26

6.26 Inference based on skewed data. The mean OC for the 31 subjects in Exercise 6.20 was 33.4 ng/ml. In our calculations, we assumed that the standard deviation was known to be 19.6 ng/ml. Use the 68–95–99.7 rule from Chapter 1 (page 57) to find the approximate bounds on the values of OC that would include these percents of the population. If the assumed standard deviation is correct, this distribution may be highly skewed. Why? (Hint: The measured values for a variable such as this are all positive.) Do you think that this skewness will invalidate the use of the Normal confidence interval in this case? Explain your answer.

Question 6.27

6.27 Average hours per week listening to the radio. The Student Monitor surveys 1200 undergraduates from four-year colleges and universities throughout the United States semiannually to understand trends among college students.¹¹ Recently, the Student Monitor reported that the average amount of time listening to the radio per week was 11.5 hours. Of the 1200 students surveyed, 83% said that they listened to the radio, so this collection of listening times has around 204 (17% × 1200) zeros. Assume that the standard deviation is 8.3 hours.

(a) Give a 95% confidence interval for the mean time spent per week listening to the radio.
(b) Is it true that 95% of the 1200 students reported weekly times that lie in the interval you found in part (a)? Explain your answer.
(c) It appears that the population distribution has many zeros and is skewed to the right. Explain why the confidence interval based on the Normal distribution should nevertheless be a good approximation.

6.27 (a) The 95% confidence interval for the mean number of hours spent listening to the radio in a week is 11.03 to 11.97 hours. (b) No. This is a range of values for the mean time spent, not for individual times. (See also the comment in the solution to Exercise 6.25.) (c) The sample size is large (n = 1200 students surveyed).

Question 6.28

6.28 Average minutes per week listening to the radio. Refer to the previous exercise.

(a) Give the mean and standard deviation in minutes.
(b) Calculate the 95% confidence interval in minutes from your answer to part (a).
(c) Explain how you could have directly calculated this interval from the 95% interval that you calculated in the previous exercise.

Question 6.29

6.29 Outlook on life. Since 2008, the Gallup-Healthways Well-Being Index has tracked how people feel about their daily lives. In 2014, 54.1% of the respondents were classified as “thriving.” This classification is based on how a respondent rates his or her current and future lives. This is the highest percent of respondents in this category since the index started. Material provided with the results noted:

Results are based on telephone interviews . . . with a random sample of 176,903 adults, living in all 50 U.S. states and the District of Columbia. For results based on the total sample of national adults, the margin of sampling error is ±1 percentage points at the 95% confidence level.¹²

The poll uses a complex multistage sample design, but the sample percent has approximately a Normal sampling distribution.

(a) The announced poll result was 54.1% ± 1%. Can we be certain that the true population percent falls in this interval? Explain your answer.
(b) Explain to someone who knows no statistics what the announced result 54.1% ± 1% means.
(c) This confidence interval has the same form we have met earlier:

$\text{estimate} \pm z^{*}\,\sigma_{\text{estimate}}$

What is the standard deviation $\sigma_{\text{estimate}}$ of the estimated percent?
(d) Does the announced margin of error include errors due to practical problems such as nonresponse? Explain your answer.

6.29 (a) We can be 95% confident, but not certain. (b) We obtained the interval 53.1% to 55.1% by a method that gives a correct result 95% of the time. (c) The standard deviation of the estimate is about 0.51% (the 1% margin of error divided by 1.96). (d) No, confidence intervals only account for random sampling error.

Question 6.30

6.30 Fuel efficiency. Computers in some vehicles calculate various quantities related to performance. One of these is the fuel efficiency, or gas mileage, usually expressed as miles per gallon (mpg). For one vehicle equipped in this way, the miles per gallon were recorded each time the gas tank was filled, and the computer was then reset.¹³ Here are the mpg values for a random sample of 20 of these records:

41.5	50.7	36.6	37.3	34.2	45.0	48.0	43.2	47.7	42.2
43.2	44.6	48.4	46.4	46.8	39.2	37.3	43.5	44.3	43.3

Suppose that the standard deviation is known to be σ = 3.5 mpg.

(a) What is $\sigma_{\bar{x}}$, the standard deviation of $\bar{x}$?
(b) Examine the data for skewness and other signs of non-Normality. Show your plots and numerical summaries. Do you think it is reasonable to construct a confidence interval based on the Normal distribution? Explain your answer.
(c) Give a 95% confidence interval for μ, the mean miles per gallon for this vehicle.

Question 6.31

6.31 Fuel efficiency in metric units. In the previous exercise, you found an estimate with a margin of error for the average miles per gallon. Convert your estimate and margin of error to the metric units kilometers per liter (kpl). To change mpg to kpl, use the fact that 1 mile = 1.609 kilometers and 1 gallon = 3.785 liters.

6.31 $\bar{x}_{\text{kpl}} = 18.3515$ kpl and the margin of error is 0.6521 kpl.
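
The conversion can be sketched in a few lines, using the 20 mpg values from Exercise 6.30 and assuming the table value z* = 1.96. Both the estimate and the margin of error are multiplied by the same factor, 1.609/3.785, because a change of units is a linear rescaling.

```python
from math import sqrt
from statistics import mean

# The 20 mpg records listed in Exercise 6.30
mpg = [41.5, 50.7, 36.6, 37.3, 34.2, 45.0, 48.0, 43.2, 47.7, 42.2,
       43.2, 44.6, 48.4, 46.4, 46.8, 39.2, 37.3, 43.5, 44.3, 43.3]

SIGMA = 3.5                  # known standard deviation in mpg
Z_95 = 1.96                  # assumed table value of z* for 95% confidence
KPL_PER_MPG = 1.609 / 3.785  # (km per mile) / (liters per gallon)

x_bar_mpg = mean(mpg)
margin_mpg = Z_95 * SIGMA / sqrt(len(mpg))

# Rescale both quantities by the same conversion factor
x_bar_kpl = x_bar_mpg * KPL_PER_MPG
margin_kpl = margin_mpg * KPL_PER_MPG

print(f"mean = {x_bar_kpl:.4f} kpl, margin of error = {margin_kpl:.4f} kpl")
```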

Question 6.32

6.32 How many “hits”? The Confidence Interval applet lets you simulate large numbers of confidence intervals quickly. Select 95% confidence and then sample 50 intervals. Record the number of intervals that cover the true value (this appears in the “Hit” box in the applet). Press the “Reset” button and repeat 30 times. Make a stemplot of the results and find the mean. Describe the results. If you repeated this experiment very many times, what would you expect the average number of hits to be?
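
For readers without access to the applet, the experiment can be imitated with a rough simulation. Everything specific here is an assumption rather than part of the exercise: a Normal population with mean 0 and standard deviation 1, samples of size 20, the table value z* = 1.96, and a fixed seed so that a run can be repeated.

```python
import random
from math import sqrt

def count_hits(rng, n_intervals=50, n=20, mu=0.0, sigma=1.0, z_star=1.96):
    """Count how many of n_intervals z intervals cover the true mean mu."""
    margin = z_star * sigma / sqrt(n)
    hits = 0
    for _ in range(n_intervals):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        # The interval x_bar +/- margin covers mu exactly when |x_bar - mu| <= margin
        if abs(x_bar - mu) <= margin:
            hits += 1
    return hits

rng = random.Random(1)   # fixed seed, chosen arbitrarily for repeatability
results = [count_hits(rng) for _ in range(30)]   # 30 repetitions of 50 intervals

print(sorted(results))
print(sum(results) / len(results))
```

The sorted list can be turned into a stemplot by hand, and the final line prints the mean number of hits across the 30 repetitions.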

Question 6.33

6.33 Required sample size for specified margin of error. A new bone study is being planned that will measure the biomarker TRAP described in Exercise 6.19. Using the value of σ given there, 6.5 U/l, find the sample size required to provide an estimate of the mean TRAP with a margin of error of 1.5 U/l for 95% confidence.

6.33 n = 73.

Question 6.34

6.34 Adjusting required sample size for dropouts. Refer to the previous exercise. In similar previous studies, about 20% of the subjects drop out before the study is completed. Adjust your sample size requirement so that you will have enough subjects at the end of the study to meet the margin of error criterion.

Question 6.35

6.35 Radio poll. A national public radio (NPR) station invites listeners to enter a dispute about a proposed “pay as you throw” waste collection program. The station asks listeners to call in and state how much each 10-gallon bag of trash should cost. A total of 179 listeners call in. The station calculates the 95% confidence interval for the average fee to be $0.53 to $1.39. Is this result trustworthy? Explain your answer.

6.35 No; the confidence interval methods of this chapter can only be used on an SRS, and listeners who choose to call in are a voluntary response sample.

Question 6.36

6.36 Accuracy of a laboratory scale. To assess the accuracy of a laboratory scale, a standard weight known to weigh 10 grams is weighed repeatedly. The scale readings are Normally distributed with unknown mean (this mean is 10 grams if the scale has no bias). The standard deviation of the scale readings is known to be 0.0002 gram.

(a) The weight is measured six times. The mean result is 10.0023 grams. Give a 99% confidence interval for the mean of repeated measurements of the weight.
(b) Based on the interval in part (a), do you think the scale is accurate? Explain your answer.
(c) How many measurements must be averaged to get a margin of error of ±0.0001 with 99% confidence?

Question 6.37

6.37 More than one confidence interval. As we prepare to take a sample and compute a 95% confidence interval, we know that the probability that the interval we compute will cover the parameter is 0.95. That’s the meaning of 95% confidence. If we plan to use several such intervals, however, our confidence that all of them will give correct results is less than 95%. Suppose that we plan to take independent samples each month for five months and report a 95% confidence interval for each set of data.

(a) What is the probability that all five intervals will cover the true means? This probability (expressed as a percent) is our overall confidence level for the five simultaneous statements.
(b) Suppose we instead considered individual 99% confidence intervals. Now, what is the overall confidence level for the five simultaneous statements?
(c) Based on the results of parts (a) and (b), how could you keep the overall confidence level near 95% if you were considering 10 simultaneous intervals?

6.37 (a) 0.7738. (b) 0.9510. (c) Use an individual confidence level of $0.95^{1/10} = 0.99488$, or about 99.5%, for each interval.
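
These three values follow from multiplying probabilities of independent events, as the sketch below shows. The function names are illustrative, and the calculation relies on the independence of the samples stated in the exercise.

```python
def overall_confidence(individual_level, k):
    """Probability that all k independent intervals cover their parameters."""
    return individual_level ** k

def individual_level_needed(overall_level, k):
    """Per-interval confidence level that gives the desired overall level."""
    return overall_level ** (1 / k)

part_a = overall_confidence(0.95, 5)        # five 95% intervals
part_b = overall_confidence(0.99, 5)        # five 99% intervals
part_c = individual_level_needed(0.95, 10)  # ten intervals, 95% overall

print(f"(a) {part_a:.4f}  (b) {part_b:.4f}  (c) {part_c:.5f}")
```

As a check, raising the part (c) level to the tenth power returns 0.95, the intended overall confidence level.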