23.8 A television poll. A television news program conducts a call-
23.9 Soccer salaries. Paris Saint-
23.10 How do we feel? A Gallup Poll taken in late October 2011 found that 58% of the American adults surveyed, reflecting on the day before they were surveyed, said they experienced a lot of happiness and enjoyment without a lot of stress and worry. The poll had a margin of sampling error of ±3 percentage points at 95% confidence. A news commentator at the time said the poll is surprising, given the current economic climate, in that it supports the notion that the majority of American adults are experiencing a lot of happiness and enjoyment without a lot of stress and worry. Why do you think the news commentator said this?
23.11 Legalizing marijuana. An October 2015 Gallup Poll found that 58% of the American adults surveyed support the legalization of marijuana in the United States. The poll had a margin of sampling error of ±4 percentage points at 95% confidence.
(a) Why do the results of this survey suggest that a majority of American adults support the legalization of marijuana in the United States?
(b) The survey was based on 1015 American adults. If we wanted the margin of sampling error to be ±2 percentage points at 95% confidence, how many Americans would need to be surveyed?
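One way to reason about part (b) is to assume that, at a fixed confidence level, the margin of sampling error shrinks roughly in proportion to 1/√n. The sketch below applies only that scaling assumption to the reported figures (1015 adults, ±4 points); the function name `required_sample_size` is hypothetical, and a real polling organization may use a more detailed formula.

```python
import math

def required_sample_size(current_n, current_margin, target_margin):
    """Scale the sample size, assuming margin of error is proportional to 1/sqrt(n).

    This is an approximation: it ignores design effects and any rounding
    in the reported margin.
    """
    factor = (current_margin / target_margin) ** 2
    return math.ceil(current_n * factor)

# Reported poll: 1015 adults, margin of +/-4 percentage points.
# Target: +/-2 percentage points at the same confidence level.
n_needed = required_sample_size(1015, 4.0, 2.0)
print(n_needed)
```

Under this assumption, halving the margin of error requires about four times as many respondents.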
23.12 How far do rich parents take us? How much education children get is strongly associated with the wealth and social status of their parents. In social science jargon, this is “socioeconomic status,” or SES. But the SES of parents has little influence on whether children who have graduated from college go on to yet more education. One study looked at whether college graduates took the graduate admissions tests for business, law, and other graduate programs. The effects of the parents’ SES on taking the LSAT test for law school were “both statistically insignificant and small.”
(a) What does “statistically insignificant” mean?
(b) Why is it important that the effects were small in size as well as insignificant?
23.13 Searching for ESP. A researcher looking for evidence of extrasensory perception (ESP) tests 200 subjects. Only one of these subjects does significantly better (P < 0.01) than random guessing.
(a) Do the results of this study provide strong evidence that this person has ESP? Explain your answer.
(b) What should the researcher now do to test whether the subject has ESP?
23.14 Are the drugs really effective? A March 29, 2012, article in the Columbus Dispatch reported that a former researcher at a major pharmaceutical company found that many basic studies on the effectiveness of new cancer drugs appeared to be unreliable. Among the studies the former researcher reviewed was one that had been published in a reputable journal. In this published study, a cancer drug was reported as having a statistically significant positive effect on treating cancer. For purposes of this problem, assume that statistically significant means significant at level 0.05.
(a) Explain in language that is understandable to someone who knows no statistics what “statistically significant at level 0.05” means.
(b) The former researcher interviewed the lead author of the published paper. The newspaper article reported that the lead author admitted that they had repeated their experiment six times and got a significant result only once but put it in the paper because it made the best story. In light of this admission, do you think that it is accurate to claim in their published study that the findings were significant at the 0.05 level? Explain your answer. (Note: A statistician can show that an event that has only probability 0.05 of occurring on any given trial will occur at least once in six trials with probability about 0.26.)
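The figure in the note can be checked with the complement rule, assuming the six repetitions are independent and each has probability 0.05 of producing a significant result when there is no real effect. The helper name `prob_at_least_one` is hypothetical.

```python
def prob_at_least_one(p, trials):
    """Probability that an event with per-trial probability p occurs
    at least once in the given number of independent trials."""
    return 1 - (1 - p) ** trials

# Six repetitions, each with a 0.05 chance of a "significant" result by chance alone.
prob = prob_at_least_one(0.05, 6)
print(round(prob, 2))  # 0.26
```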
23.15 Comparing bottle designs. A company compares two designs for bottles of an energy drink by placing bottles with both designs on the shelves of several markets in a large city. Checkout scanner data on more than 10,000 bottles bought show that more shoppers bought Design A than Design B. The difference is statistically significant (P = 0.018). Can we conclude that consumers strongly prefer Design A? Explain your answer.
23.16 Color blindness in Africa. An anthropologist suspects that color blindness is less common in societies that live by hunting and gathering than in settled agricultural societies. He tests a number of adults in two populations in Africa, one of each type. The proportion of color-
23.17 Blood types in Southeast Asia. One way to assess whether two human groups should be considered separate populations is to compare their distributions of blood types. An anthropologist finds significantly different (P = 0.01) proportions of the main human blood types (A, B, AB, O) in different tribes in central Malaysia. What other information would you want before you agree that these tribes are separate populations?
23.18 Why we seek significance. Asked why statistical significance appears so often in research reports, a student says, “Because saying that results are significant tells us that they cannot easily be explained by chance variation alone.” Do you think that this statement is essentially correct? Explain your answer.
23.19 What is significance good for? Which of the following questions does a test of significance answer?
(a) Is the sample or experiment properly designed?
(b) Is the observed effect due to chance?
(c) Is the observed effect important?
23.20 What distinguishes those who have schizophrenia? Psychologists once measured 77 variables on a sample of people who had schizophrenia and a sample of people who did not have schizophrenia. They compared the two samples using 77 separate significance tests. Two of these tests were significant at the 5% level. Suppose that there is, in fact, no difference in any of the 77 variables between people who do and do not have schizophrenia in the adult population. That is, all 77 null hypotheses are true.
(a) What is the probability that one specific test shows a difference that is significant at the 5% level?
(b) Why is it not surprising that two of the 77 tests were significant at the 5% level?
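A rough sketch of the multiple-testing arithmetic follows. It assumes all 77 null hypotheses are true (as the exercise states) and, as a simplification that is not stated in the exercise, that the 77 tests are independent. In practice, tests run on the same subjects are usually correlated, so the second number is only indicative.

```python
def expected_false_positives(num_tests, alpha):
    """Expected number of tests significant at level alpha when every null is true."""
    return num_tests * alpha

def prob_at_least_two(num_tests, alpha):
    """P(at least two significant results), assuming independent tests and true nulls."""
    p_none = (1 - alpha) ** num_tests
    p_exactly_one = num_tests * alpha * (1 - alpha) ** (num_tests - 1)
    return 1 - p_none - p_exactly_one

expected = expected_false_positives(77, 0.05)
p_two_or_more = prob_at_least_two(77, 0.05)
print(f"expected significant tests: {expected:.2f}")
print(f"P(at least two significant): {p_two_or_more:.2f}")
```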
23.21 Why are larger samples better? Statisticians prefer large samples. Describe briefly the effect of increasing the size of a sample (or the number of subjects in an experiment) on each of the following.
(a) The margin of error of a 95% confidence interval.
(b) The P-value of a test, when H0 is false and all facts about the population remain unchanged as n increases.
23.22 Is this convincing? You are planning to test a vaccine for a virus that now has no vaccine. Because the disease is usually not serious, you will expose 100 volunteers to the virus. After some time, you will record whether or not each volunteer has been infected.
(a) Explain how you would use these 100 volunteers in a designed experiment to test the vaccine. Include all important details of designing the experiment (but don’t actually do any random allocation).
(b) You hope to show that the vaccine is more effective than a placebo. State H0 and Ha. (Notice that this test compares two population proportions.)
(c) The experiment gave a P-value of 0.15. Explain carefully what this means.
(d) Your fellow researchers do not consider this evidence strong enough to recommend regular use of the vaccine. Do you agree?
The following two exercises are based on the optional section in this chapter.
23.23 In the courtroom. A criminal trial can be thought of as a decision problem, the two possible decisions being “guilty” and “not guilty.” Moreover, in a criminal trial there is a null hypothesis in the sense of an assertion that we will continue to hold until we have strong evidence against it. Criminal trials are, therefore, similar to hypothesis testing.
(a) What are H0 and Ha in a criminal trial? Explain your choice of H0.
(b) Describe in words the meaning of Type I error and Type II error in this setting, and display the possible outcomes in a diagram like Figures 23.3 and 23.4.
(c) Suppose that you are a jury member. Having studied statistics, you think in terms of a significance level α, the (subjective) probability of a Type I error. What considerations would affect your personal choice of α? (For example, would the difference between a charge of murder and a charge of shoplifting affect your personal α?)
23.24 Acceptance sampling. You are a consumer of potatoes in an acceptance sampling situation. Your acceptance sampling plan has probability 0.01 of passing a truckload of potatoes that does not meet quality standards. You might think that the truckloads that pass are almost all good. Alas, it is not so.
(a) Explain why low probabilities of error cannot ensure that truckloads that pass are mostly good. (Hint: What happens if your supplier ships all bad truckloads?)
(b) The paradox that most decisions can be correct (low error probabilities) and yet most truckloads that pass can be bad has important analogs in areas such as medical diagnosis. Explain why most conclusions that a patient has a rare disease can be false alarms even if the diagnostic system is correct 99% of the time.
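The false-alarm paradox in part (b) can be illustrated with a small calculation. The numbers below are assumptions chosen for illustration only: a disease affecting 1 person in 1000, and a diagnostic system that is correct 99% of the time for both sick and healthy patients. The function name `prob_disease_given_positive` is hypothetical.

```python
def prob_disease_given_positive(prevalence, accuracy):
    """Share of positive diagnoses that are true positives.

    Assumes the same accuracy for sick patients (true positive rate)
    and for healthy patients (true negative rate).
    """
    true_positives = prevalence * accuracy
    false_positives = (1 - prevalence) * (1 - accuracy)
    return true_positives / (true_positives + false_positives)

# Hypothetical inputs: prevalence 0.001, accuracy 0.99.
ppv = prob_disease_given_positive(0.001, 0.99)
print(f"share of positive diagnoses that are correct: {ppv:.3f}")
```

With these assumed inputs, fewer than one positive diagnosis in ten is correct, because healthy patients vastly outnumber sick ones.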
The following exercises require carrying out the methods described in the optional sections of Chapters 21 and 22.
23.25 Do our athletes graduate? Return to the study in Exercise 22.19 (page 540), which found that 137 of 190 athletes admitted to a large university graduated within six years. The proportion of athletes who graduated was significantly lower (P = 0.033) than the 78% graduation rate for all students. It may be more informative to give a 95% confidence interval for the graduation rate of athletes. Do this.
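A minimal sketch of the large-sample confidence interval for a proportion, p̂ ± z*·√(p̂(1 − p̂)/n), is shown below. It assumes the critical value z* = 1.96 for 95% confidence and treats the 190 athletes as a simple random sample; the function name `proportion_ci` is hypothetical.

```python
import math

def proportion_ci(successes, n, z_star=1.96):
    """Large-sample confidence interval for a population proportion.

    z_star = 1.96 corresponds to 95% confidence under the Normal approximation.
    """
    p_hat = successes / n
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# 137 of 190 athletes graduated within six years.
lower, upper = proportion_ci(137, 190)
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```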
23.26 Using the Internet. Return to the study in Exercise 22.20 (page 541), which found that 168 of an SRS of 200 entering students at a large state university said that they used the Internet frequently for research or homework. This did not differ significantly (P = 0.242) from the 81.8% of all first-
(a) It may be more informative to give a 95% confidence interval for the proportion of this university’s entering students who claim to use the Internet frequently for research or homework. Do this.
(b) Is the 81.8% national figure included in your confidence interval? Explain why you are or are not surprised by this.
(c) Suppose that 1680 of an SRS of 2000 entering students at a large state university had said that they used the Internet frequently for research or homework. In this situation, the results would differ significantly (P < 0.001) from the 81.8% of all first-
23.27 Holiday spending. An October 2015 Gallup Poll asked a random sample of 1015 American adults “Roughly how much money do you think you personally will spend on Christmas gifts this year?” The mean holiday spending estimate in the sample was . We will treat these data as an SRS from a Normally distributed population with standard deviation .
(a) Give a 95% confidence interval for the mean holiday spending estimate based on these data.
(b) The mean holiday spending estimate of included 12% of respondents who answered “No opinion.” This group includes those respondents who do not celebrate Christmas and thus reported $0. Do you trust the interval you computed in part (a) as a 95% confidence interval for the mean holiday spending estimate for all American adults? Why or why not?
23.28 Is it significant? Over several years and many thousands of students, 85% of the high school students in a large city have passed the competency test that is one of the requirements for a diploma. Now reformers claim that a new mathematics curriculum will increase the percentage who pass. A random sample of 1000 students follow the new curriculum. The school board wants to see an improvement that is statistically significant at the 5% level before it will adopt the new program for all students. If p is the proportion of all students who would pass the exam if they followed the new curriculum, we must test
H0: p = 0.85
Ha: p > 0.85
(a) Suppose that 868 of the 1000 students in the sample pass the test. Show that this is not significant at the 5% level. (Follow the method of Example 3, page 530, in Chapter 22.)
(b) Suppose that 869 of the 1000 students pass. Show that this is significant at the 5% level.
(c) Is there a practical difference between 868 successes in 1000 tries and 869 successes? What can you conclude about the importance of a fixed significance level?
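The comparison in this exercise can be sketched with a one-sided z test for a proportion under the Normal approximation. This is an illustrative calculation, not necessarily the exact procedure of Example 3 in Chapter 22; the function names `normal_cdf` and `one_sided_p_value` are hypothetical.

```python
import math

def normal_cdf(z):
    """Standard Normal cumulative probability, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def one_sided_p_value(successes, n, p0):
    """z statistic and P-value for H0: p = p0 against Ha: p > p0."""
    p_hat = successes / n
    se = math.sqrt(p0 * (1.0 - p0) / n)  # standard error computed under H0
    z = (p_hat - p0) / se
    return z, 1.0 - normal_cdf(z)

z_868, p_868 = one_sided_p_value(868, 1000, 0.85)
z_869, p_869 = one_sided_p_value(869, 1000, 0.85)
print(f"868 passes: z = {z_868:.2f}, P = {p_868:.4f}")
print(f"869 passes: z = {z_869:.2f}, P = {p_869:.4f}")
```

The two P-values fall on opposite sides of 0.05 even though the samples differ by a single student.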
23.29 We like confidence intervals. The previous exercise compared significance tests about the proportion p of all students who would pass a competency test, based on data showing that either 868 or 869 of an SRS of 1000 students passed. Give the 95% confidence interval for p in both cases. The intervals make clear how uncertain we are about the true value of p and how little difference there is between the two sample outcomes.
EXPLORING THE WEB
Follow the QR code to access exercises.