For Exercises 7.1 and 7.2, see page 410; for Exercises 7.3 and 7.4, see page 412; for Exercises 7.5 and 7.6, see page 415; for Exercises 7.7 and 7.8, see page 418; for Exercises 7.9 through 7.11, see pages 422–423; and for Exercises 7.12 and 7.13, see page 424.
7.14 What is wrong? In each of the following situations, identify what is wrong and then either explain why it is wrong or change the wording of the statement to make it true.
(a) As the degrees of freedom k decrease, the t distribution density curve gets closer to the N(0, 1) curve.
(b) The standard error of the sample mean is .
(c) A researcher wants to test versus the one-sided alternative .
(d) The 95% margin of error for the mean μ of a Normal population with unknown σ is the same for all SRS of size n.
7.15 Finding the critical value t*. What critical value t* from Table D should be used to calculate the margin of error for a confidence interval for the mean of the population in each of the following situations?
(a) A 95% confidence interval based on n = 15 observations.
(b) A 95% confidence interval from an SRS of 28 observations.
(c) A 90% confidence interval from a sample of size 28.
(d) These cases illustrate how the size of the margin of error depends upon the confidence level and the sample size. Summarize these relationships.
7.16 Distribution of the t statistic. Assume a sample size of . Draw a picture of the distribution of the t statistic under the null hypothesis. Use Table D and your picture to illustrate the values of the test statistic that would lead to rejection of the null hypothesis at the 5% level for a two-sided alternative.
7.17 More on the distribution of the t statistic. Repeat the previous exercise for the two situations where the alternative is one-sided.
7.18 One-sided versus two-sided P-values. Computer software reports and for a t test of versus . Based on prior knowledge, you justified testing the alternative . What is the P-value for your significance test?
7.19 More on one-sided versus two-sided P-values. Suppose that computer software reports and for a t test of versus . Would this change your P-value for the alternative hypothesis in the previous exercise? Use a sketch of the distribution of the test statistic under the null hypothesis to illustrate and explain your answer.
7.20 A one-sample t test. The one-sample t statistic for testing
H0: μ = 8
Ha: μ > 8
from a sample of n = 22 observations has the value t = 2.24.
(a) What are the degrees of freedom for this statistic?
(b) Give the two critical values t* from Table D that bracket t.
(c) Between what two values does the P-value of the test fall?
(d) Is the value significant at the 5% level? Is it significant at the 1% level?
(e) If you have software available, find the exact P-value.
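For readers without a statistics package, the following is a minimal sketch of how an exact one-sided P-value could be obtained using only the Python standard library. It integrates the t density numerically with Simpson's rule; the function names `t_pdf` and `upper_tail` and the step count are illustrative assumptions, not part of the text, and dedicated statistical software is the more usual route.

```python
import math

def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail(t: float, df: int, steps: int = 2000) -> float:
    """Approximate P(T >= t) using Simpson's rule on [0, |t|] and symmetry.

    steps must be even for Simpson's rule.
    """
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3      # P(0 <= T <= |t|)
    tail = 0.5 - central         # P(T >= |t|)
    return tail if t >= 0 else 1 - tail

# Exercise 7.20: t = 2.24 with n = 22, so df = 21, and Ha: mu > 8.
p_one_sided = upper_tail(2.24, df=21)
print(f"one-sided P-value: {p_one_sided:.4f}")
```

For a two-sided alternative, as in Exercise 7.21, the same sketch would be used with `2 * upper_tail(abs(t), df)`.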
7.21 Another one-sample t test. The one-sample t statistic for testing
H0: μ = 40
Ha: μ ≠ 40
from a sample of observations has the value .
(a) What are the degrees of freedom for t?
(b) Locate the two critical values t* from Table D that bracket t.
(c) Between what two values does the P-value of the test fall?
(d) Is the value statistically significant at the 5% level? At the 1% level?
(e) If you have software available, find the exact P-value.
7.22 A final one-sample t test. The one-sample t statistic for testing
H0: μ = 20
Ha: μ < 20
based on observations has the value .
(a) What are the degrees of freedom for this statistic?
(b) Between what two values does the P-value of the test fall?
(c) If you have software available, find the exact P-value.
7.23 Two-sided to one-sided P-value. Most software gives P-values for two-sided alternatives. Explain why you cannot always divide these P-values by 2 to obtain P-values for one-sided alternatives.
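The conversion at issue here can be sketched in a few lines of Python. The function name `one_sided_p` and the numbers in the usage lines are hypothetical, chosen only for illustration; the point is that halving works only when the sign of t agrees with the direction of the alternative.

```python
def one_sided_p(two_sided_p: float, t: float, alternative: str) -> float:
    """Convert a two-sided t-test P-value to a one-sided P-value.

    alternative is "greater" (Ha: mu > mu0) or "less" (Ha: mu < mu0).
    """
    if alternative not in ("greater", "less"):
        raise ValueError("alternative must be 'greater' or 'less'")
    # Does the observed t point in the direction claimed by Ha?
    agrees = (t > 0) if alternative == "greater" else (t < 0)
    half = two_sided_p / 2
    return half if agrees else 1 - half

# Hypothetical software output: two-sided P = 0.04.
p_same_direction = one_sided_p(0.04, 2.1, "greater")    # t agrees with Ha
p_wrong_direction = one_sided_p(0.04, -2.1, "greater")  # t contradicts Ha
print(p_same_direction, p_wrong_direction)
```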
7.24 Business bankruptcies in Canada. Business bankruptcies in Canada are monitored by the Office of the Superintendent of Bankruptcy Canada (OSB).8 Included in each report are the assets and liabilities the company declared at the time of the bankruptcy filing. A study is based on a random sample of 75 reports from the current year. The average debt (liabilities minus assets) is $92,172 with a standard deviation of $111,538.
(a) Construct a 95% one-sample t confidence interval for the average debt of these companies at the time of filing.
(b) Because the sample standard deviation is larger than the sample mean, this debt distribution is skewed. Provide a defense for using the t confidence interval in this case.
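When only summary statistics are reported, the one-sample t interval takes the form x̄ ± t* · s/√n. A minimal Python sketch of that arithmetic follows. The helper name `t_interval` is an assumption for illustration, and the critical value 2.000 assumes the conservative Table D row for df = 60, since df = 74 is not tabled; software would use df = 74 and give a slightly smaller t*.

```python
import math

def t_interval(mean: float, sd: float, n: int, t_star: float):
    """Return (lower, upper, margin) for the interval mean +/- t* * sd / sqrt(n)."""
    se = sd / math.sqrt(n)       # standard error of the sample mean
    margin = t_star * se
    return mean - margin, mean + margin, margin

# Assumption: t* = 2.000 from Table D (df = 60, 95% confidence).
T_STAR = 2.000
low, high, margin = t_interval(mean=92172, sd=111538, n=75, t_star=T_STAR)
print(f"95% CI: {low:.0f} to {high:.0f} (margin of error {margin:.0f})")
```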
7.25 Fuel economy. Although the Environmental Protection Agency (EPA) establishes the tests to determine the fuel economy of new cars, it often does not perform them. Instead, the test protocols are given to the car companies, and the companies perform the tests themselves. To keep the industry honest, the EPA runs some spot checks each year. Recently, the EPA announced that Hyundai and Kia must lower their fuel economy estimates for many of their models.9 Here are some city miles per gallon (mpg) values for one of the models the EPA investigated:
MILEAGE
28.0 | 25.7 | 25.8 | 28.0 | 28.5 | 29.8 | 30.2 | 30.4 |
26.9 | 28.3 | 29.8 | 27.2 | 26.7 | 27.7 | 29.5 | 28.0 |
Give a 95% confidence interval for μ, the mean city mpg for this model.
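With raw data, the mean and standard deviation must be computed before the interval. The sketch below uses Python's standard `statistics` module for this; the variable names are illustrative, and the critical value 2.131 assumes the Table D entry for df = 15 at 95% confidence.

```python
import math
import statistics

# City mpg values from the MILEAGE data set above.
mpg = [28.0, 25.7, 25.8, 28.0, 28.5, 29.8, 30.2, 30.4,
       26.9, 28.3, 29.8, 27.2, 26.7, 27.7, 29.5, 28.0]

n = len(mpg)
xbar = statistics.mean(mpg)
s = statistics.stdev(mpg)        # sample standard deviation (divisor n - 1)

T_STAR = 2.131                   # assumption: Table D, df = n - 1 = 15, 95%
margin = T_STAR * s / math.sqrt(n)
print(f"mean = {xbar:.3f}, s = {s:.3f}, 95% CI = {xbar - margin:.2f} to {xbar + margin:.2f}")
```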
7.26 Testing the sticker information. Refer to the previous exercise. The vehicle sticker information for this model stated a city average of 30 mpg. Are these mpg values consistent with the vehicle sticker? Perform a significance test using the 0.05 significance level. Be sure to specify the hypotheses, the test statistic, the P-value, and your conclusion.
MILEAGE
7.27 UberX driver earnings. On its blog, Uber posted a scatterplot using a sample of several thousand drivers in New York City. The plot shows each driver’s average net earnings per hour versus the number of hours worked.10 Here is a sample of earnings (dollars) for 27 drivers working 40 hours a week.
UBERX
26.25 | 33.51 | 43.91 | 31.91 | 31.78 | 43.37 | 36.66 | 31.69 | 31.25 |
46.86 | 35.44 | 40.30 | 30.93 | 37.80 | 42.44 | 43.80 | 49.64 | 36.79 |
34.10 | 37.54 | 30.93 | 38.40 | 37.83 | 21.73 | 41.62 | 26.25 | 33.51 |
(a) Do you think it is appropriate to use the t methods of this section to compute a 95% confidence interval for the average earnings per hour of New York City UberX drivers working 40 hours a week? Generate a plot to support your answer.
(b) Report the 95% confidence interval for μ, the average earnings per hour of New York City UberX drivers working 40 hours a week, as an estimate and margin of error.
(c) Report the 95% confidence interval for the average annual earnings of New York City UberX drivers working 40 hours a week.
(d) According to Uber, the median wage of an UberX driver working at least 40 hours in New York City is $90,766. Can these data be used to assess this claim? Explain your answer.
7.28 Number of friends on Facebook. To mark Facebook’s 10th birthday, Pew Research surveyed people using Facebook to see what they like and dislike about the site. The survey found that among adult Facebook users, the average number of friends is 338. This distribution takes only integer values, so it is certainly not Normal. It is also highly skewed to the right with a median of 200 friends.11 Consider the following SRS of Facebook users from your large university.
FACEFR
107 | 246 | 289 | 177 | 155 | 101 | 80 | 461 | 336 | 78 |
463 | 264 | 827 | 180 | 221 | 1065 | 79 | 691 | 70 | 921 |
126 | 672 | 296 | 60 | 11 | 227 | 84 | 787 | 18 | 82 |
(a) Are these data also heavily skewed? Use graphical methods to examine the distribution. Write a short summary of your findings.
(b) Do you think it is appropriate to use the t methods of this section to compute a 95% confidence interval for the mean number of friends that Facebook users at your large university have? Explain why or why not.
(c) Compute the sample mean and standard deviation, the standard error of the mean, and the margin of error for 95% confidence.
(d) Report the 95% confidence interval for μ, the average number of friends for Facebook users at your large university.
7.29 Alcohol content in beer. In February 2013, two California residents filed a class-action lawsuit against Anheuser-Busch, alleging the company was watering down beers to boost profits.12 They argued that because water was being added, the true alcohol content of the beer by volume is less than the advertised amount. For example, they alleged that Budweiser beer has an alcohol content by volume of 4.7% instead of the stated 5%. CNN, NPR, and a local St. Louis news team picked up on this suit and hired independent labs to test samples of Budweiser beer and find the alcohol content. Below are the results of these tests, each done on a single can.
BUD
4.94 | 5.00 | 4.99 |
(a) Even though we have a very small sample, test the null hypothesis that the alcohol content is 4.7% by volume. Do the data provide evidence against the claim of the two residents?
(b) Construct a 95% confidence interval for the true alcohol content in Budweiser.
(c) U.S. government standards require that the true alcohol content in all cans and bottles be within ±0.3% of the advertised level. Do these tests provide strong evidence that this is the case for Budweiser beer? Explain your answer.
7.30 Using the Internet on a computer. The Nielsen Company reported that U.S. residents aged 18 to 24 years spend an average of 32.5 hours per month using the Internet on a computer.13 You wonder if this is true for students at your large university because so many students use their smartphones to access the Internet. You collect an SRS of students and obtain hours with hours.
(a) Report the 95% confidence interval for μ, the average number of hours per month that students at your university use the Internet on a computer.
(b) Use this interval to test whether the average time for students at your university is different from the average reported by Nielsen. Use the 5% significance level. Summarize your results.
7.31 Rudeness and its effect on onlookers. Many believe that an uncivil environment has a negative effect on people. A pair of researchers performed a series of experiments to test whether witnessing rudeness and disrespect affects task performance.14 In one study, 34 participants met in small groups and witnessed the group organizer being rude to a “participant” who showed up late for the group meeting. After the exchange, each participant performed an individual brainstorming task in which he or she was asked to produce as many uses for a brick as possible in five minutes. The mean number of uses was 7.88 with a standard deviation of 2.35.
(a) Suppose that prior research has shown that the average number of uses a person can produce in five minutes under normal conditions is 10. Given that the researchers hypothesize that witnessing this rudeness will decrease performance, state the appropriate null and alternative hypotheses.
(b) Carry out the significance test using a significance level of 0.05. Give the P-value and state your conclusion.
7.32 Fuel efficiency t test. Computers in some vehicles calculate various quantities related to performance. One of these is the fuel efficiency, or gas mileage, usually expressed as miles per gallon (mpg). For one vehicle equipped in this way, the miles per gallon were recorded each time the gas tank was filled, and the computer was then reset.15 Here are the mpg values for a random sample of 20 of these records:
MPG
41.5 | 50.7 | 36.6 | 37.3 | 34.2 | 45.0 | 48.0 | 43.2 | 47.7 | 42.2 |
43.2 | 44.6 | 48.4 | 46.4 | 46.8 | 39.2 | 37.3 | 43.5 | 44.3 | 43.3 |
(a) Describe the distribution using graphical methods. Is it appropriate to analyze these data using methods based on Normal distributions? Explain why or why not.
(b) Find the mean, standard deviation, standard error, and margin of error for 95% confidence.
(c) Report the 95% confidence interval for μ, the mean miles per gallon for this vehicle based on these data.
7.33 Tree diameter confidence interval. A study of 584 longleaf pine trees in the Wade Tract in Thomas County, Georgia, is described in Example 6.1 (page 342). For each tree in the tract, the researchers measured the diameter at breast height (DBH). This is the diameter of the tree at a height of 4.5 feet, and the units are centimeters (cm). Only trees with DBH greater than 1.5 cm were sampled. Here are the diameters of a random sample of 40 of these trees:
PINES
10.5 | 13.3 | 26.0 | 18.3 | 52.2 | 9.2 | 26.1 | 17.6 | 40.5 | 31.8 |
47.2 | 11.4 | 2.7 | 69.3 | 44.4 | 16.9 | 35.7 | 5.4 | 44.2 | 2.2 |
4.3 | 7.8 | 38.1 | 2.2 | 11.4 | 51.5 | 4.9 | 39.7 | 32.6 | 51.8 |
43.6 | 2.3 | 44.6 | 31.5 | 40.3 | 22.3 | 43.3 | 37.5 | 29.1 | 27.9 |
(a) Use a histogram or stemplot and a boxplot to examine the distribution of DBHs. Include a Normal quantile plot if you have the necessary software. Write a careful description of the distribution.
(b) Is it appropriate to use the methods of this section to find a 95% confidence interval for the mean DBH of all trees in the Wade Tract? Explain why or why not.
(c) Report the mean with the margin of error and the confidence interval. Write a short summary describing the meaning of the confidence interval.
(d) Do you think these results would apply to other similar trees in the same area? Give reasons for your answer.
7.34 Nutritional intake among Canadian high-performance male athletes. Recall Exercise 6.74 (page 382). For one part of the study, male athletes from eight Canadian sports centers were surveyed. Their average caloric intake was 3077.0 kilocalories per day (kcal/d) with a standard deviation of 987.0. The recommended amount is 3421.7. Is there evidence that Canadian high-performance male athletes are deficient in their caloric intake?
(a) State the appropriate H0 and Ha to test this.
(b) Carry out the test, give the P-value, and state your conclusion.
(c) Construct a 95% confidence interval for the average deficiency in caloric intake.
7.35 Average number of Instagram posts. LocoWise provides social media analytics to companies and marketing agencies through a variety of online tools. One tool is the Instagram Analyzer, which allows a user to compare a profile with 2500 other Instagram profiles. Recently, it reported that the 2500 profiles it monitors averaged 2.55 posts per day, with a minimum value of 0 posts and a maximum value of 95 posts.16
(a) A common estimator of the standard deviation when provided the range R is . Compute this estimate of s for these data.
(b) Construct the 95% confidence interval for the average number of Instagram posts per day.
(c) These data are clearly skewed and possibly have a few outliers. Do you think it is appropriate to use the t procedures? Explain your answer.
7.36 Stress levels in parents of children with ADHD. In a study of parents who have children with attention-deficit/hyperactivity disorder (ADHD), parents were asked to rate their overall stress level using the Parental Stress Scale (PSS).17 This scale has 18 items that contain statements regarding both positive and negative aspects of parenthood. Respondents are asked to rate their agreement with each statement using a five-point scale (1 = strongly disagree to 5 = strongly agree). The scores are summed such that a higher score indicates greater stress. The mean rating for the 50 parents in the study was reported as 52.98 with a standard deviation of 10.34.
(a) Do you think that these data are approximately Normally distributed? Explain why or why not.
(b) Is it appropriate to use the methods of this section to compute a 90% confidence interval? Explain why or why not.
(c) Find the 90% margin of error and the corresponding confidence interval. Write a sentence explaining the interval and the meaning of the 90% confidence level.
(d) To recruit parents for the study, the researchers visited a psychiatric outpatient service in Rohtak, India, and selected 50 consecutive families who met the inclusion and exclusion criteria. To what extent do you think the results can be generalized to all parents with children who have ADHD in India or in other locations around the world?
7.37 Are the parents feeling extreme stress? Refer to the previous exercise. The researchers considered a score greater than 45 to represent extreme stress. Is there evidence that the average stress level for the parents in this study is above this level? Perform a test of significance using and summarize your results.
7.38 Food intake and weight gain. If we increase our food intake, we generally gain weight. Nutrition scientists can calculate the amount of weight gain that would be associated with a given increase in calories. In one study, 16 nonobese adults, aged 25 to 36 years, were fed 1000 calories per day in excess of the calories needed to maintain a stable body weight. The subjects maintained this diet for eight weeks, so they consumed a total of 56,000 extra calories.18 According to theory, 3500 extra calories will translate into a weight gain of 1 pound. Therefore, we expect each of these subjects to gain 56,000/3500 = 16 pounds (lb). Here are the weights before and after the eight-week period, expressed in kilograms (kg):
WTGAIN
Subject | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
Weight before | 55.7 | 54.9 | 59.6 | 62.3 | 74.2 | 75.6 | 70.7 | 53.3 |
Weight after | 61.7 | 58.8 | 66.0 | 66.2 | 79.0 | 82.3 | 74.3 | 59.3 |
Subject | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 |
Weight before | 73.3 | 63.4 | 68.1 | 73.7 | 91.7 | 55.9 | 61.7 | 57.8 |
Weight after | 79.1 | 66.0 | 73.4 | 76.9 | 93.1 | 63.0 | 68.2 | 60.3 |
(a) For each subject, subtract the weight before from the weight after to determine the weight change.
(b) Find the mean and the standard deviation for the weight change.
(c) Calculate the standard error and the margin of error for 95% confidence. Report the 95% confidence interval for weight change in a sentence that explains the meaning of the 95%.
(d) Convert the mean weight gain in kilograms to mean weight gain in pounds. Because there are about 2.2 lb per kilogram, multiply the value in kilograms by 2.2 to obtain pounds. Do the same for the standard deviation and the confidence interval.
(e) Test the null hypothesis that the mean weight gain is 16 lb. Be sure to specify the null and alternative hypotheses, the test statistic with degrees of freedom, and the P-value. What do you conclude?
(f) Write a short paragraph explaining your results.
7.39 Food intake and NEAT. Nonexercise activity thermogenesis (NEAT) provides a partial explanation for the results you found in the previous analysis. NEAT is energy burned by fidgeting, maintenance of posture, spontaneous muscle contraction, and other activities of daily living. In the study of the previous exercise, the 16 subjects increased their NEAT by 328 calories per day, on average, in response to the additional food intake. The standard deviation was 256.
(a) Test the null hypothesis that there was no change in NEAT versus the two-sided alternative. Summarize the results of the test and give your conclusion.
(b) Find a 95% confidence interval for the change in NEAT. Discuss the additional information provided by the confidence interval that is not evident from the results of the significance test.
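When a study reports only the mean and standard deviation, the one-sample t statistic is t = (x̄ − μ0)/(s/√n) with n − 1 degrees of freedom. A brief Python sketch of that calculation, using the NEAT summary values above, is given here; the function name `one_sample_t` is an illustrative assumption.

```python
import math

def one_sample_t(mean: float, mu0: float, sd: float, n: int):
    """Return (t, df) for testing H0: mu = mu0 from summary statistics."""
    se = sd / math.sqrt(n)       # standard error of the sample mean
    return (mean - mu0) / se, n - 1

# NEAT change: mean 328 calories per day, s = 256, n = 16; H0: mu = 0.
t_stat, df = one_sample_t(mean=328, mu0=0, sd=256, n=16)
print(f"t = {t_stat:.3f} with {df} degrees of freedom")
```

The resulting t would then be compared with Table D, or passed to software, to obtain the P-value.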
7.40 Potential insurance fraud? Insurance adjusters are concerned about the high estimates they are receiving from Jocko’s Garage. To see if the estimates are unreasonably high, each of 10 damaged cars was taken to Jocko’s and to another garage and the estimates (in dollars) were recorded. Here are the results:
JOCKO
Car | 1 | 2 | 3 | 4 | 5 |
Jocko’s | 1410 | 1550 | 1250 | 1300 | 900 |
Other | 1250 | 1300 | 1250 | 1200 | 950 |
Car | 6 | 7 | 8 | 9 | 10 |
Jocko’s | 1520 | 1750 | 3600 | 2250 | 2840 |
Other | 1575 | 1600 | 3380 | 2125 | 2600 |
(a) For each car, subtract the estimate of the other garage from Jocko’s estimate. Find the mean and the standard deviation for this difference.
(b) Test the null hypothesis that there is no difference between the estimates of the two garages. Be sure to specify the null and alternative hypotheses, the test statistic with degrees of freedom, and the P-value. What do you conclude using the 0.05 significance level?
(c) Construct a 95% confidence interval for the difference in estimates.
(d) The insurance company is considering seeking repayment from 1000 claims filed with Jocko’s last year. Using your answer to part (c), what repayment would you recommend the insurance company seek? Explain your answer.
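A matched pairs analysis reduces to a one-sample analysis of the differences. The following Python sketch shows that reduction for the JOCKO data using only the standard library; the variable names are illustrative, and the P-value still has to be found from Table D or software.

```python
import math
import statistics

# Estimates (dollars) for the same 10 cars, in car order 1 through 10.
jocko = [1410, 1550, 1250, 1300, 900, 1520, 1750, 3600, 2250, 2840]
other = [1250, 1300, 1250, 1200, 950, 1575, 1600, 3380, 2125, 2600]

# Paired differences: Jocko's estimate minus the other garage's estimate.
diffs = [j - o for j, o in zip(jocko, other)]

n = len(diffs)
dbar = statistics.mean(diffs)
s_d = statistics.stdev(diffs)            # sample SD of the differences
t_stat = dbar / (s_d / math.sqrt(n))     # tests H0: mean difference = 0
df = n - 1
print(f"mean diff = {dbar}, s = {s_d:.1f}, t = {t_stat:.2f}, df = {df}")
```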
7.41 Fuel efficiency comparison t test. Refer to Exercise 7.32. In addition to the computer calculating miles per gallon, the driver also recorded the miles per gallon by dividing the miles driven by the number of gallons at fill-up. The driver wants to determine if these calculations are different.
MPGDIFF
Fill-up | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
Computer | 41.5 | 50.7 | 36.6 | 37.3 | 34.2 | 45.0 | 48.0 | 43.2 | 47.7 | 42.2 |
Driver | 36.5 | 44.2 | 37.2 | 35.6 | 30.5 | 40.5 | 40.0 | 41.0 | 42.8 | 39.2 |
Fill-up | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 |
Computer | 43.2 | 44.6 | 48.4 | 46.4 | 46.8 | 39.2 | 37.3 | 43.5 | 44.3 | 43.3 |
Driver | 38.8 | 44.5 | 45.4 | 45.3 | 45.7 | 34.2 | 35.2 | 39.8 | 44.9 | 47.5 |
(a) State the appropriate H0 and Ha.
(b) Carry out the test using a significance level of 0.05. Give the P-value, and then interpret the result.
7.42 Counts of picks in a one-pound bag. A guitar supply company must maintain strict oversight on the number of picks they package for sale to customers. Their current advertisement specifies between 900 and 1000 picks in every bag. An SRS of 36 one-pound bags of picks was collected as part of a quality improvement effort within the company. The number of picks in each bag is shown in the following table.
PICKS
924 | 925 | 967 | 909 | 959 | 937 | 970 | 936 | 952 |
919 | 965 | 921 | 913 | 886 | 956 | 962 | 916 | 945 |
957 | 912 | 961 | 950 | 923 | 935 | 969 | 916 | 952 |
917 | 977 | 940 | 924 | 957 | 920 | 986 | 895 | 923 |
(a) Create (i) a histogram or stemplot, (ii) a boxplot, and (iii) a Normal quantile plot of these counts. Write a careful description of the distribution. Make sure to note any outliers, and comment on the skewness and Normality of the data.
(b) Based on your observations in part (a), is it appropriate to analyze these data using the t procedures? Briefly explain your response.
(c) Find the mean, the standard deviation, and the standard error of the mean for this sample.
(d) Calculate the 90% confidence interval for the mean number of picks in a one-pound bag.
7.43 Significance test for the average number of picks. Refer to the previous exercise.
PICKS
(a) Do these data provide evidence that the average number of picks in a one-pound bag is greater than 925? Using a significance level of 5%, state your hypotheses, the P-value, and your conclusions.
(b) Do these data provide evidence that the average number of picks in a one-pound bag is greater than 935? Using a significance level of 5%, state your hypotheses, the P-value, and your conclusion.
(c) Explain the relationship between your conclusions in parts (a) and (b) and the 90% confidence interval calculated in the previous problem.
7.44 A customer satisfaction survey. Many organizations are doing surveys to determine the satisfaction of their customers. Attitudes toward various aspects of campus life were the subject of one such study conducted at Purdue University. Each item was rated on a 1 to 5 scale, with 5 being the highest rating. The average response of 1568 first-year students to “Feeling welcomed at Purdue” was 3.83 with a standard deviation of 1.10. Assuming that the respondents are an SRS, give a 90% confidence interval for the mean of all first-year students.
7.45 Comparing operators of a DXA machine. Dual-energy X-ray absorptiometry (DXA) is a technique for measuring bone health. One of the most common measures is total body bone mineral content (TBBMC). A highly skilled operator is required to take the measurements. Recently, a new DXA machine was purchased by a research lab, and two operators were trained to take the measurements. TBBMC for eight subjects was measured by both operators.19 The units are grams (g). A comparison of the means for the two operators provides a check on the training they received and allows us to determine if one of the operators is producing measurements that are consistently higher than the other. Here are the data:
TBBMC
Operator | Subject 1 | Subject 2 | Subject 3 | Subject 4 | Subject 5 | Subject 6 | Subject 7 | Subject 8 |
---|---|---|---|---|---|---|---|---|
1 | 1.328 | 1.342 | 1.075 | 1.228 | 0.939 | 1.004 | 1.178 | 1.286 |
2 | 1.323 | 1.322 | 1.073 | 1.233 | 0.934 | 1.019 | 1.184 | 1.304 |
(a) Take the difference between the TBBMC recorded for Operator 1 and the TBBMC for Operator 2. Describe the distribution of these differences. Is it appropriate to analyze these data using the t methods? Explain why or why not.
(b) Use a significance test to examine the null hypothesis that the two operators have the same mean. Be sure to give the test statistic with its degrees of freedom, the P-value, and your conclusion.
(c) The sample here is rather small, so we may not have much power to detect differences of interest. Use a 95% confidence interval to provide a range of differences that are compatible with these data.
(d) The eight subjects used for this comparison were not a random sample. In fact, they were friends of the researchers whose ages and weights were similar to those of the types of people who would be measured with this DXA machine. Comment on the appropriateness of this procedure for selecting a sample, and discuss any consequences regarding the interpretation of the significance-testing and confidence interval results.
7.46 Equivalence of paper and computer-based questionnaires. Computers are commonly being used to complete questionnaires because of the increased efficiency of data collection and reduction in coding errors. Studies, however, have shown that questionnaire format can influence responses, especially for items of a sensitive nature.20 Consider the small study below comparing paper and computer survey formats of a self-report measure of mental health. Each participant completed both forms on adjacent days with the order determined by a flip of a coin.
EQUIV
Subject | Paper | Computer | Diff | Subject | Paper | Computer | Diff |
---|---|---|---|---|---|---|---|
1 | 5 | 2 | 3 | 11 | 6 | 5 | 1 |
2 | 4 | 3 | 1 | 12 | 5 | 5 | 0 |
3 | 4 | 4 | 0 | 13 | 3 | 7 | −4 |
4 | 7 | 8 | −1 | 14 | 3 | 6 | −3 |
5 | 4 | 5 | −1 | 15 | 4 | 4 | 0 |
6 | 6 | 7 | −1 | 16 | 2 | 3 | −1 |
7 | 4 | 3 | 1 | 17 | 7 | 10 | −3 |
8 | 6 | 8 | −2 | 18 | 8 | 7 | 1 |
9 | 6 | 5 | 1 | 19 | 4 | 6 | −2 |
10 | 2 | 3 | −1 | 20 | 6 | 8 | −2 |
(a) Explain to someone unfamiliar with statistics why this experiment is a matched pairs design.
(b) The measure involves 10 items and produces a whole number score ranging between 0 and 20. Do you think it is appropriate to use the t procedures on the difference in survey scores? Explain your answer.
(c) Perform an equivalency test at the 0.05 level using the limits ±0.5 and state your conclusion.
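One common way to carry out an equivalence test at level α is to compute a (1 − 2α) confidence interval for the mean difference and check whether it lies entirely inside the equivalence limits. The Python sketch below applies that approach to the Diff column; it assumes this confidence-interval form of the test and the Table D critical value 1.729 (df = 19, 90% confidence), and the variable names are illustrative.

```python
import math
import statistics

# Paper minus computer differences for subjects 1 through 20 (Diff column).
diffs = [3, 1, 0, -1, -1, -1, 1, -2, 1, -1,
         1, 0, -4, -3, 0, -1, -3, 1, -2, -2]

LIMIT = 0.5                      # equivalence limits are -0.5 and +0.5
T_STAR = 1.729                   # assumption: Table D, df = 19, 90% confidence

n = len(diffs)
dbar = statistics.mean(diffs)
se = statistics.stdev(diffs) / math.sqrt(n)
low, high = dbar - T_STAR * se, dbar + T_STAR * se

# Equivalence is concluded only if the whole interval is inside the limits.
equivalent = (-LIMIT < low) and (high < LIMIT)
print(f"90% CI: {low:.3f} to {high:.3f}; equivalence concluded: {equivalent}")
```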
7.47 Assessment of a foreign-language institute. The National Endowment for the Humanities sponsors summer institutes to improve the skills of high school teachers of foreign languages. One such institute hosted 20 French teachers for four weeks. At the beginning of the period, the teachers were given the Modern Language Association’s listening test of understanding of spoken French. After four weeks of immersion in French in and out of class, the listening test was given again. (The actual French spoken in the two tests was different, so that simply taking the first test should not improve the score on the second test.) The maximum possible score on the test is 36.²¹ Here are the data:
SUMLANG
Teacher | Pretest | Posttest | Gain | Teacher | Pretest | Posttest | Gain |
---|---|---|---|---|---|---|---|
1 | 32 | 34 | 2 | 11 | 30 | 36 | 6 |
2 | 31 | 31 | 0 | 12 | 20 | 26 | 6 |
3 | 29 | 35 | 6 | 13 | 24 | 27 | 3 |
4 | 10 | 16 | 6 | 14 | 24 | 24 | 0 |
5 | 30 | 33 | 3 | 15 | 31 | 32 | 1 |
6 | 33 | 36 | 3 | 16 | 30 | 31 | 1 |
7 | 22 | 24 | 2 | 17 | 15 | 15 | 0 |
8 | 25 | 28 | 3 | 18 | 32 | 34 | 2 |
9 | 32 | 26 | −6 | 19 | 23 | 26 | 3 |
10 | 20 | 26 | 6 | 20 | 23 | 26 | 3 |
To analyze these data, we first subtract the pretest score from the posttest score to obtain the improvement for each teacher. These 20 differences form a single sample. They appear in the “Gain” columns. The first teacher, for example, improved from 32 to 34, so the gain is 34 − 32 = 2.
(a) State appropriate null and alternative hypotheses for examining the question of whether or not the course improves French spoken-language skills.
(b) Describe the gain data. Use numerical and graphical summaries.
(c) Perform the significance test. Give the test statistic, the degrees of freedom, and the P-value. Summarize your conclusion.
(d) Give a 95% confidence interval for the mean improvement.