6.118 Change in number insured.
The Wall Street Journal reported a Rand study on the estimated change in insured Americans from September 2013 to March 2014.23 Here is an excerpt:
… a net gain of 9.3 million people with coverage. That number came with a wide margin of error (3.5 million people), was driven largely by increased employer-based coverage, and didn’t fully capture the surge in enrollments that occurred in late March as the application deadline for Obamacare plans neared.
The reported margin of error is based on a 95% level of confidence. What is the 95% confidence interval for the change in people with coverage?
6.119 Coverage percent of 95% confidence interval.
For this exercise, use the Confidence Interval applet. Set the confidence level at 95%, and click the “Sample” button 10 times to simulate 10 confidence intervals. Record the percent hit (that is, percent of intervals including the population mean). Simulate another 10 intervals by clicking another 10 times (do not click the “Reset” button). Record the percent hit for your 20 intervals. Repeat the process of simulating 10 additional intervals and recording the results until you have a total of 200 intervals. Plot your results and write a summary of what you have found.
6.119
Applet, answers will vary.
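For readers without access to the applet, the same experiment can be sketched in code. This is only an illustration under stated assumptions: the population is taken to be Normal with mean 0 and standard deviation 1, each interval uses a sample of size 20 with known $\sigma$, and the function name `cumulative_percent_hit` and the seed are hypothetical choices, not part of the applet.

```python
import random
from statistics import NormalDist, fmean

def cumulative_percent_hit(mu=0.0, sigma=1.0, n=20, level=0.95,
                           batches=20, batch_size=10, seed=1):
    """Return the cumulative percent of intervals covering mu after each batch."""
    rng = random.Random(seed)
    z_star = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    margin = z_star * sigma / n ** 0.5              # known-sigma margin of error
    hits = 0
    percents = []
    for batch in range(1, batches + 1):
        for _ in range(batch_size):
            sample = [rng.gauss(mu, sigma) for _ in range(n)]
            # The interval xbar +/- margin covers mu when |xbar - mu| <= margin.
            if abs(fmean(sample) - mu) <= margin:
                hits += 1
        percents.append(100 * hits / (batch * batch_size))
    return percents

percents = cumulative_percent_hit()
for k, pct in enumerate(percents, start=1):
    print(f"after {10 * k:3d} intervals: {pct:5.1f}% hit")
```

The percent hit tends to bounce around for the first few batches and settle near the confidence level as intervals accumulate. Passing `level=0.90` gives the corresponding experiment for 90% confidence.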
6.120 Coverage percent of 90% confidence interval.
Refer to the previous exercise. Do the simulations and report the results for 90% confidence.
6.121 Change the confidence level.
Refer to Example 6.21 (page 329) and construct a 95% confidence interval for the mean initial return for the population of Chinese IPO firms.
6.121
(61.17, 71.43).
6.122 Job satisfaction.
A study of job satisfaction of Croatian employees was conducted on a research sample of 4000+ employees.24 The researcher developed a metric for overall job satisfaction based on the rating of numerous factors, including nature of work, top management, promotion, pay, status, working conditions, and others. The job satisfaction metric ranges from 1 to 5. Here is a table found in the report:
| | $n$ | Mean | Standard deviation | Standard error of mean |
---|---|---|---|---|
Men | 2261 | 3.4601 | 0.86208 | ? |
Women | 1975 | 3.5842 | 0.75004 | ? |
Given the large sample sizes, we can assume that the sample standard deviations are the population standard deviations.
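The missing column follows from the definition of the standard error of a sample mean, $s/\sqrt{n}$. A minimal sketch, assuming the unlabeled first numeric column of the table is the sample size $n$ (the two counts sum to 4236, consistent with "4000+ employees"):

```python
from math import sqrt

# (n, mean, standard deviation) for each group, as given in the table.
groups = {
    "Men": (2261, 3.4601, 0.86208),
    "Women": (1975, 3.5842, 0.75004),
}

# Standard error of the mean: standard deviation divided by sqrt(n).
standard_errors = {name: sd / sqrt(n) for name, (n, _, sd) in groups.items()}

for name, se in standard_errors.items():
    print(f"{name}: SE = {se:.5f}")
```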
6.123 Really small $P$-value.
For Example 6.21 (page 329), we noted that the $P$-value for testing the null hypothesis of is . Without calculation, we further noted that the $P$-value is obviously much less than 0.001.
6.123
(a) 1.4929E-141.
6.124 Supply chain practices.
In a Stanford University study of supply chain practices, researchers gathered data on numerous companies and computed the correlations between various managerial practices and metrics on social responsibility.25 In the report, the researchers only report correlations that meet the following criteria: correlation value and $P$-value . Why do you think the researchers are not reporting statistically significant correlations that are less than 0.2?
6.125 Wine.
Many food products contain small quantities of substances that would give an undesirable taste or smell if they were present in large amounts. An example is the “off-odors” caused by sulfur compounds in wine. Oenologists (wine experts) have determined the odor threshold, the lowest concentration of a compound that the human nose can detect. For example, the odor threshold for dimethyl sulfide (DMS) is given in the oenology literature as 25 micrograms per liter of wine ($\mu$g/L). Untrained noses may be less sensitive, however. Here are the DMS odor thresholds for 10 beginning students of oenology:
31 | 31 | 43 | 36 | 23 | 34 | 32 | 30 | 20 | 24 |
Assume (this is not realistic) that the standard deviation of the odor threshold for untrained noses is known to be $\sigma = 7$ $\mu$g/L.
odor
6.125
(b) (26.06, 34.74). (c) $z = 2.44$, $P = 0.0073$. The mean odor threshold for the beginning students is higher than the published threshold of 25.
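The interval in part (b) can be checked with a short calculation. This sketch assumes $\sigma = 7$ $\mu$g/L, the value consistent with the reported interval, and assumes a one-sided alternative (mean threshold greater than 25) for the test, in line with the stated conclusion.

```python
from statistics import NormalDist, fmean

thresholds = [31, 31, 43, 36, 23, 34, 32, 30, 20, 24]  # DMS odor thresholds
sigma = 7.0   # assumed known population standard deviation
mu0 = 25.0    # published odor threshold

n = len(thresholds)
xbar = fmean(thresholds)
se = sigma / n ** 0.5                     # standard deviation of the sample mean

z_star = NormalDist().inv_cdf(0.975)      # critical value for 95% confidence
ci = (xbar - z_star * se, xbar + z_star * se)

z = (xbar - mu0) / se                     # one-sample z statistic
p_value = 1 - NormalDist().cdf(z)         # one-sided: Ha is mu > 25

print(f"mean = {xbar:.1f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
print(f"z = {z:.2f}, one-sided P = {p_value:.4f}")
```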
6.126 Too much cellulose to be profitable?
Excess cellulose in alfalfa reduces the “relative feed value” of the product that will be fed to dairy cows. If the cellulose content is too high, the price will be lower and the producer will have less profit. An agronomist examines the cellulose content of one type of alfalfa hay. Suppose that the cellulose content in the population has standard deviation milligrams per gram (mg/g). A sample of 15 cuttings has mean cellulose content .
6.127 Where do you buy?
Consumers can purchase nonprescription medications at food stores, mass merchandise stores such as Kmart and Walmart, or pharmacies. About 45% of consumers make such purchases at pharmacies. What accounts for the popularity of pharmacies, which often charge higher prices?
A study examined consumers’ perceptions of overall performance of the three types of store using a long questionnaire that asked about such things as “neat and attractive store,” “knowledgeable staff,” and “assistance in choosing among various types of nonprescription medication.” A performance score was based on 27 such questions. The subjects were 201 people chosen at random from the Indianapolis telephone directory. Here are the means and standard deviations of the performance scores for the sample:26
Store type | $\bar{x}$ | $s$ |
---|---|---|
Food stores | 18.67 | 24.95 |
Mass merchandisers | 32.38 | 33.37 |
Pharmacies | 48.60 | 35.62 |
We do not know the population standard deviations, but a sample standard deviation $s$ from so large a sample is usually close to $\sigma$. Use $s$ in place of the unknown $\sigma$ in this exercise.
6.127
(a) The ideal population is all nonprescription medication customers. The actual population consists of those listed in the Indianapolis telephone directory. (b) Food stores: (15.22, 22.12), Mass merchandisers: (27.77, 36.99), Pharmacies: (43.68, 53.52). (c) Yes, the confidence interval for the pharmacies gives values much higher than in the other two intervals.
6.128 Using software on a data set.
Refer to Exercise 6.125 and the DMS odor threshold data. As noted in the exercise, assume $\sigma = 7$. Read the data into statistical software, and obtain the 95% confidence interval for the mean DMS. Standard Excel does not provide an option for confidence intervals for the mean when $\sigma$ is known.
odor
6.129 Using software with summary measures.
Most statistical software packages provide an option to find confidence interval limits by inputting the sample mean, sample size, population standard deviation, and desired confidence level.
6.129
(a) (18.49, 21.51). (b) (18.58, 21.42)
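A calculation from summary measures needs only the four inputs named in the exercise. The function below is a hypothetical helper (`z_interval` is not the name of any particular package's routine). Because the inputs for this exercise are not reproduced here, it is demonstrated on the summary statistics of Exercise 6.127, with $n = 201$ and $s$ used in place of $\sigma$.

```python
from statistics import NormalDist

def z_interval(xbar, sigma, n, level=0.95):
    """Confidence interval for a mean when sigma is treated as known."""
    z_star = NormalDist().inv_cdf(0.5 + level / 2)  # upper critical value
    margin = z_star * sigma / n ** 0.5
    return xbar - margin, xbar + margin

# (sample mean, sample standard deviation) from the table in Exercise 6.127.
stores = {
    "Food stores": (18.67, 24.95),
    "Mass merchandisers": (32.38, 33.37),
    "Pharmacies": (48.60, 35.62),
}

intervals = {
    name: tuple(round(limit, 2) for limit in z_interval(xbar, s, n=201))
    for name, (xbar, s) in stores.items()
}

for name, (low, high) in intervals.items():
    print(f"{name}: ({low}, {high})")
```

Changing `level` to 0.90 or 0.99 shows how the width of the interval depends on the confidence level.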
6.130 CEO pay.
A study of the pay of corporate chief executive officers (CEOs) examined the increase in cash compensation of the CEOs of 104 companies, adjusted for inflation, in a recent year. The mean increase in real compensation was , and the standard deviation of the increases was . Is this good evidence that the mean real compensation of all CEOs increased that year? The hypotheses are $H_0\colon \mu = 0$ (no increase) and $H_a\colon \mu > 0$ (an increase), where $\mu$ is the mean increase in real compensation for all CEOs.
Because the sample size is large, the sample $s$ is close to the population $\sigma$, so take $\sigma = s$.
6.131 Large samples.
Statisticians prefer large samples. Describe briefly the effect of increasing the size of a sample (or the number of subjects in an experiment) on each of the following.
6.131
(a) The confidence interval gets narrower. (b) The $P$-value gets smaller. (c) Power increases.
6.132 Roulette.
A roulette wheel has 18 red slots among its 38 slots. You observe many spins and record the number of times that red occurs. Now you want to use these data to test whether the probability of a red has the value that is correct for a fair roulette wheel. State the hypotheses $H_0$ and $H_a$ that you will test.
6.133 Significant.
When asked to explain the meaning of “statistically significant at the $\alpha = 0.05$ level,” a student says, “This means there is only probability 0.05 that the null hypothesis is true.” Is this a correct explanation of statistical significance? Explain your answer.
6.133
This student is wrong. A significance level of $\alpha = 0.05$ means that, when the null hypothesis is true, there is a 5% chance that we will incorrectly reject it.
6.134 Significant.
Another student, when asked why statistical significance appears so often in research reports, says, “Because saying that results are significant tells us that they cannot easily be explained by chance variation alone.” Do you think that this statement is essentially correct? Explain your answer.
6.135 Welfare reform.
A study compares two groups of mothers with young children who were on welfare two years ago. One group attended a voluntary training program offered free of charge at a local vocational school and advertised in the local news media. The other group did not choose to attend the training program. The study finds a significant difference () between the proportions of the mothers in the two groups who are still on welfare. The difference is not only significant but quite large. The report says that with 95% confidence the percent of the nonattending group still on welfare is higher than that of the group who attended the program. You are on the staff of a member of Congress who is interested in the plight of welfare mothers and who asks you about the report.
6.135
(a) The difference between the groups is so large that we do not believe it is attributable to chance. (b) 95% confidence means that the method used to produce the interval gives correct results 95% of the time in the long run. (c) Not necessarily, because there are likely to be lurking variables. For example, it is possible that those mothers willing to sign up for the training program are also more actively seeking employment, which could account for the difference.
6.136 Sample mean distribution.
Consider the following distribution for a discrete random variable $X$:
Value of $X$ | −2 | −1 | 0 | 1 |
---|---|---|---|---|
Probability | 1/4 | 1/4 | 1/4 | 1/4 |
Imagine a simple experiment of randomly generating a value for $X$ and recording it and then repeating a second time. Recognize that it is possible to get the same result on both trials. Finally, take the average of the two observed values.
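Because there are only 16 equally likely ordered pairs of outcomes, the distribution of the average can be enumerated exactly. The sketch below takes the four values and probabilities as given in the table and assumes the two trials are independent. The variable names are illustrative.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

values = [-2, -1, 0, 1]
prob = Fraction(1, 4)          # each value is equally likely

# Accumulate the probability of each possible average of two independent draws.
dist = defaultdict(Fraction)
for first, second in product(values, repeat=2):
    dist[Fraction(first + second, 2)] += prob * prob

mean = sum(x * p for x, p in dist.items())
variance = sum((x - mean) ** 2 * p for x, p in dist.items())

for x in sorted(dist):
    print(f"average = {float(x):5.1f}   probability = {dist[x]}")
print(f"mean = {mean}, variance = {variance}")
```

Using `Fraction` keeps every probability exact, so the results can be compared directly with a hand calculation.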
6.137 Median statistic.
When a distribution is symmetric, the mean and median are equal. So, when sampling from a symmetric population, it would seem that we would be indifferent between the sample mean and the sample median for estimating the population mean. Let’s explore this question by simulation. With software, you need to generate 1000 SRSs of size $n = 5$ from the standard Normal distribution. The easiest way to proceed is to create five adjacent columns of 1000 rows of random numbers from the standard Normal distribution.
For each row, find the mean and median of the five random observations. In JMP, define new columns using the formula editor, with the Mean function applied to the five columns and the Quantile function with the first argument as 0.5 and the other arguments being each of the five columns. In Minitab, this all can be done using the Row Statistics option found under Calc.
6.137
Answers will vary. (a) They both should be close to 0. (b) The theoretical standard deviation is 0.4472. The estimated standard deviation should be close to this number. (c) This will be somewhat higher than 0.4472. (d) The standard deviation of the median statistic is larger than the standard deviation of the mean statistic. (e) D is associated with the mean, and B is associated with the median.
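The same simulation can be run outside JMP or Minitab. This is a minimal sketch using only the Python standard library. The seed and variable names are arbitrary choices, and the exact numbers will differ from run to run if the seed is changed.

```python
import random
from statistics import mean, median, stdev

rng = random.Random(2024)   # arbitrary seed so the run is reproducible

# 1000 rows, each an SRS of size 5 from the standard Normal distribution.
samples = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(1000)]

means = [mean(row) for row in samples]
medians = [median(row) for row in samples]

sd_means = stdev(means)       # theory: 1 / sqrt(5), about 0.4472
sd_medians = stdev(medians)   # expected to be somewhat larger

print(f"mean of sample means   = {mean(means):.4f}")
print(f"mean of sample medians = {mean(medians):.4f}")
print(f"SD of sample means     = {sd_means:.4f}")
print(f"SD of sample medians   = {sd_medians:.4f}")
```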