section 3.1
CDC Funding. The following table shows the funding provided by the Centers for Disease Control (CDC) to each of the New England states to fight HIV/AIDS. Use the data for Exercises 1–3.
State | Funding ($ millions) |
---|---|
Connecticut | 7.8 |
Maine | 1.9 |
Massachusetts | 14.9 |
New Hampshire | 1.5 |
Rhode Island | 2.7 |
Vermont | 1.6 |
1. Find the mean.
3.99.1
$5.07 million
2. Calculate the median.
3. Suppose we added California, with $62.1 million in funding, to the data set. Recompute the mean and the median. Which is more affected by the presence of California? What can we say about each of the mean and the median, with respect to extreme values?
3.99.3
Mean = $13.21 million, median = $2.7 million. The mean is more affected by the presence of California. In general, the mean is more sensitive to extreme values than the median is.
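The comparison in Exercises 1–3 can be checked with a short script. This is a minimal sketch using Python's standard `statistics` module; the variable names are illustrative and not part of the exercise.

```python
from statistics import mean, median

# CDC funding in $ millions, copied from the table above.
funding = {
    "Connecticut": 7.8, "Maine": 1.9, "Massachusetts": 14.9,
    "New Hampshire": 1.5, "Rhode Island": 2.7, "Vermont": 1.6,
}

new_england = list(funding.values())
with_california = new_england + [62.1]  # California's funding from Exercise 3

mean_ne, median_ne = mean(new_england), median(new_england)
mean_ca, median_ca = mean(with_california), median(with_california)

print(f"New England only: mean = {mean_ne:.2f}, median = {median_ne:.2f}")
print(f"With California:  mean = {mean_ca:.2f}, median = {median_ca:.2f}")
```

Adding one extreme value moves the mean by several million dollars, while the median shifts only to the next value in the sorted list.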
Calories in Cereal. For Exercises 4–8, refer to the calories in breakfast cereals given in Table 23 (page 169).
4. Compute the mean.
5. Calculate the median.
3.99.5
110 calories
6. Find the mode.
7. If we eliminated the cereals with 90 or fewer calories from the sample, which measure would not be affected at all? Why?
3.99.7
The mode, since the value with the largest frequency is unaffected by the deletion of values 90 or less.
8. If we added 10 calories to each cereal, how would that affect the mean, median, and mode? Would it affect each of the measures equally?
section 3.2
CDC Funding to Fight HIV/AIDS. Refer to the CDC funding data above for Exercises 9–14. Omit California.
9. Find the range of the data set.
3.99.9
$13.4 million
10. For each state, find its deviation from the population mean.
11. Calculate the average deviation. Would the average deviation be a good measure of spread? Why or why not?
3.99.11
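No. The deviations from the mean always sum to 0, so the average deviation is always 0, no matter how spread out the data are.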
12. Compute the sum of squared deviations. Then divide by the number of states. The result is the population variance, σ².
13. Take the square root of the population variance to find the population standard deviation, σ.
3.99.13
$4.9080 million.
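The steps in Exercises 10–13 can be sketched in a few lines. This is an illustrative sketch rather than a prescribed method; it treats the six states as the whole population, so the sum of squared deviations is divided by N instead of N − 1.

```python
from math import sqrt

# CDC funding in $ millions for the six New England states (California omitted).
funding = [7.8, 1.9, 14.9, 1.5, 2.7, 1.6]

n = len(funding)
mu = sum(funding) / n                            # population mean
deviations = [x - mu for x in funding]           # Exercise 10
avg_deviation = sum(deviations) / n              # Exercise 11: zero up to rounding error
variance = sum(d ** 2 for d in deviations) / n   # Exercise 12: divide by N
sigma = sqrt(variance)                           # Exercise 13

print(f"average deviation = {avg_deviation:.10f}")
print(f"population variance = {variance:.4f}")
print(f"population standard deviation = {sigma:.4f}")
```

Squaring the deviations before averaging is what keeps the positive and negative deviations from cancelling.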
14. Interpret the value for the standard deviation.
Calories in Cereal. For Exercises 15–17, refer to the calories in breakfast cereals given in Table 23 (page 169).
15. Calculate the standard deviation of the sample.
3.99.15
11.6450 calories
16. Suppose we consider the cereals in Table 23 to be representative of all breakfast cereals. Use the mean from Exercise 4 and the standard deviation from Exercise 15, along with Chebyshev's Rule, to find two values between which at least 75% of cereal calories will fall.
17. Refer to the previous exercise. Now further assume the data distribution is bell-shaped. Find two values between which about 95% of cereal calories will fall.
3.99.17
85.88 calories, 132.46 calories
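Exercises 16 and 17 both use an interval of the form mean ± k standard deviations. The sketch below shows the calculation for k = 2. The standard deviation comes from the answer to Exercise 15. The mean from Exercise 4 is not reproduced here, so the value 109.17 is an assumption, taken as the midpoint of the two values in the answer to Exercise 17. The function names are hypothetical.

```python
def k_sd_interval(mean, sd, k):
    """Return the interval (mean - k*sd, mean + k*sd)."""
    return mean - k * sd, mean + k * sd


def chebyshev_min_fraction(k):
    """Chebyshev's Rule: at least 1 - 1/k^2 of the data lie within k SDs (k > 1)."""
    if k <= 1:
        raise ValueError("Chebyshev's Rule requires k > 1")
    return 1 - 1 / k ** 2


sd = 11.6450    # from the answer to Exercise 15
mean = 109.17   # ASSUMPTION: midpoint of the interval in the answer to Exercise 17

low, high = k_sd_interval(mean, sd, k=2)
print(f"Chebyshev guarantee for k = 2: at least {chebyshev_min_fraction(2):.0%}")
print(f"Interval: {low:.2f} to {high:.2f} calories")
```

The same interval serves both exercises: Chebyshev's Rule guarantees at least 75% within two standard deviations for any distribution, while the Empirical Rule raises that to about 95% when the distribution is bell-shaped.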
Common Syllables in English. Refer to the table shown here of some common syllables in English for Exercises 18–21.
Syllable | Frequency |
---|---|
an | 462 |
bi | 621 |
sit | 104 |
ed | 907 |
its | 293 |
est | 186 |
wil | 470 |
tiv | 136 |
en | 675 |
biz | 114 |
18. Find the mean and the range of the syllable frequencies.
19. Would you say that a typical distance from the mean for the frequencies is about 900, about 500, about 300, or about 100?
3.99.19
About 300.
20. What is your best guesstimate of the value of a typical distance from the mean for the syllable frequencies?
21. Find the sample variance and the sample standard deviation of syllable frequencies.
3.99.21
Standard deviation = 276.2. (a) 24 (b) The frequency counts for the syllables typically differ from the mean of 396.8 by only 276.2.
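The mean, range, sample variance, and sample standard deviation for the syllable frequencies can be checked as follows. This is a minimal sketch using the standard `statistics` module, whose `variance` and `stdev` functions divide by n − 1, as is appropriate for a sample.

```python
from statistics import mean, stdev, variance

# Syllable frequencies copied from the table above.
frequencies = [462, 621, 104, 907, 293, 186, 470, 136, 675, 114]

x_bar = mean(frequencies)
data_range = max(frequencies) - min(frequencies)
s2 = variance(frequencies)   # sample variance (divides by n - 1)
s = stdev(frequencies)       # sample standard deviation

print(f"mean = {x_bar:.1f}, range = {data_range}")
print(f"sample variance = {s2:.2f}, sample standard deviation = {s:.1f}")
```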
section 3.3
Age Distribution of Twenty-Somethings. The following table shows the number of Americans (in millions) between 20 and 29 years old in 2011. Use this data for Exercises 22–25.
Age | Number (millions) |
---|---|
20 | 4.5 |
21 | 4.4 |
22 | 4.3 |
23 | 4.2 |
24 | 4.2 |
25 | 4.3 |
26 | 4.2 |
27 | 4.2 |
28 | 4.2 |
29 | 4.2 |
22. Find the estimated mean age of twenty-somethings.
23. Calculate the estimated standard deviation of Americans in their 20s.
3.99.23
2.89 years
24. Use the Empirical Rule to find two age values between which fall about 68% of all American twenty-somethings.
25. Compare your answer in the previous exercise to the actual proportion of twenty-somethings whose ages lie between the values found in the previous exercise. What does this discrepancy mean, regarding the distribution of ages in the table?
3.99.25
59.48%. The distribution is not bell-shaped.
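Exercises 22–25 can be worked from the age table as a frequency distribution. The sketch below weights each age by its count. Dividing by the total count rather than by the total minus one is an assumption here; it reproduces the 2.89 years given in the answer to Exercise 23.

```python
from math import sqrt

# Number of Americans (in millions) at each age, copied from the age table.
counts = {20: 4.5, 21: 4.4, 22: 4.3, 23: 4.2, 24: 4.2,
          25: 4.3, 26: 4.2, 27: 4.2, 28: 4.2, 29: 4.2}

total = sum(counts.values())
mean_age = sum(age * f for age, f in counts.items()) / total
var_age = sum(f * (age - mean_age) ** 2 for age, f in counts.items()) / total
sd_age = sqrt(var_age)

# Empirical Rule: about 68% should fall within one standard deviation of the mean.
low, high = mean_age - sd_age, mean_age + sd_age
within = sum(f for age, f in counts.items() if low <= age <= high)
share = within / total

print(f"mean = {mean_age:.2f} years, standard deviation = {sd_age:.2f} years")
print(f"interval = {low:.2f} to {high:.2f} years")
print(f"actual share inside the interval = {share:.2%}")
```

The counts are nearly equal at every age, so the distribution is close to uniform rather than bell-shaped, which is why the actual share falls short of 68%.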
section 3.4
Ragweed Pollen. Use the table of ragweed pollen index in New York localities for Exercises 26–41. Are you allergic to ragweed pollen? You are not alone. The American Academy of Allergy maintains the ragweed pollen index, which details the severity of the pollen problem for hundreds of communities across the nation. The following table contains the ragweed pollen index on a particular day for 10 localities in New York State.
Locality | Ragweed pollen index |
---|---|
Albany | 48 |
Binghamton | 31 |
Buffalo | 59 |
Elmira | 43 |
Manhattan | 25 |
Rochester | 60 |
Syracuse | 25 |
Tupper Lake | 8 |
Utica | 26 |
Yonkers | 38 |
Find the following percentiles of total ragweed pollen index.
26. 10th percentile
27. 50th percentile
3.99.27
34.5
28. 90th percentile
For Exercises 29–31, find the z-scores for the following localities for the ragweed pollen index.
29. Albany
3.99.29
0.7088
30. Rochester
31. Tupper Lake
3.99.31
–1.7145
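A z-score measures how many standard deviations a value lies above or below the mean. The following sketch computes z-scores for all ten localities, treating them as a sample so that the standard deviation divides by n − 1. That choice is an assumption, but it reproduces the answers to Exercises 29 and 31.

```python
from statistics import mean, stdev

# Ragweed pollen index for 10 New York localities, copied from the table above.
pollen = {
    "Albany": 48, "Binghamton": 31, "Buffalo": 59, "Elmira": 43,
    "Manhattan": 25, "Rochester": 60, "Syracuse": 25, "Tupper Lake": 8,
    "Utica": 26, "Yonkers": 38,
}

values = list(pollen.values())
x_bar = mean(values)
s = stdev(values)  # sample standard deviation (divides by n - 1)

# z-score: (value - mean) / standard deviation
z = {place: (x - x_bar) / s for place, x in pollen.items()}

for place, score in z.items():
    print(f"{place:12s} z = {score:+.4f}")
```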
32. Identify any outliers or moderately unusual observations in the ragweed pollen index.
For Exercises 33–35, find the percentile rank for the given ragweed pollen index.
33. 25
3.99.33
30th percentile
34. 59
35. 48
3.99.35
80th percentile
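The percentile rank of a value can be computed as the percentage of data values at or below it. Several conventions exist; the one sketched here is an assumption chosen because it reproduces the answers to Exercises 33 and 35. The function name is hypothetical.

```python
def percentile_rank(data, x):
    """Percentage of values in data that are less than or equal to x."""
    at_or_below = sum(1 for v in data if v <= x)
    return 100 * at_or_below / len(data)


# Ragweed pollen index values from the table above.
pollen = [48, 31, 59, 43, 25, 60, 25, 8, 26, 38]

for value in (25, 59, 48):
    print(f"index {value}: percentile rank = {percentile_rank(pollen, value):.0f}")
```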
36. Find the first, second, and third quartiles of the ragweed pollen index.
37. Find the interquartile range. Interpret what this value means.
3.99.37
IQR = 23, which is the spread of the middle 50% of the data set.
38. Detect any outliers using the IQR method.
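The quartiles, the IQR, and the IQR outlier check can be sketched together. Software packages use several different percentile conventions. The position rule below (compute i = (p/100)·n; average two neighboring values when i is a whole number, otherwise round i up) is an assumption chosen because it reproduces the median of 34.5 and the IQR of 23 given in the answers. The fences at 1.5 × IQR beyond the quartiles follow the usual form of the IQR method.

```python
from math import ceil


def percentile(data, p):
    """p-th percentile (0 < p < 100) using the position rule i = (p/100) * n."""
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    values = sorted(data)
    i = p * len(values) / 100
    if i == int(i):
        i = int(i)
        # Whole-number position: average the i-th and (i+1)-th sorted values.
        return (values[i - 1] + values[i]) / 2
    # Otherwise round the position up and take that sorted value (1-based).
    return values[ceil(i) - 1]


# Ragweed pollen index values from the table above.
pollen = [48, 31, 59, 43, 25, 60, 25, 8, 26, 38]

q1, q2, q3 = (percentile(pollen, p) for p in (25, 50, 75))
iqr = q3 - q1

# IQR method: flag values more than 1.5 * IQR beyond the quartiles.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in pollen if x < lower_fence or x > upper_fence]

print(f"Q1 = {q1}, Q2 = {q2}, Q3 = {q3}, IQR = {iqr}")
print(f"fences: {lower_fence} to {upper_fence}, outliers: {outliers}")
```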
section 3.5
39. Let's draw a boxplot of the ragweed pollen index.
3.99.39
(a) Min = 8, Q1 = 25, median = 34.5, Q3 = 48, max = 60
(b) [Boxplot figure not reproduced.]
(c) Close to symmetric, slightly right-skewed. (d) The mean should be close to the median or a little above the median. (e) Mean = 36.30, standard deviation = 16.51. The value of the mean is slightly above the median of 34.50.
40. Detect any outliers using the IQR method. Compare with Exercise 32. Do the two methods concur or disagree?
41. Suppose the ragweed pollen index in Rochester were 600 instead of 60. How would this outlier affect the quartiles and the IQR? What property of these measures is this behavior an example of?
3.99.41
Unchanged; robustness
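The robustness claim in Exercise 41 can be demonstrated by changing Rochester's value and recomputing. This sketch assumes the same position rule for quartiles that reproduces the answers above (average two neighbors when the position is a whole number, otherwise round up), and it includes the mean for contrast. The function name is hypothetical.

```python
from math import ceil
from statistics import mean


def quartiles(data):
    """Return (Q1, Q3) using the position rule i = (p/100) * n."""
    values = sorted(data)
    n = len(values)
    result = []
    for p in (25, 75):
        i = p * n / 100
        if i == int(i):
            # Whole-number position: average the i-th and (i+1)-th sorted values.
            result.append((values[int(i) - 1] + values[int(i)]) / 2)
        else:
            # Otherwise round the position up (1-based).
            result.append(values[ceil(i) - 1])
    return tuple(result)


original = [48, 31, 59, 43, 25, 60, 25, 8, 26, 38]
modified = [600 if x == 60 else x for x in original]  # Rochester: 60 becomes 600

for label, data in (("original", original), ("modified", modified)):
    q1, q3 = quartiles(data)
    print(f"{label}: Q1 = {q1}, Q3 = {q3}, IQR = {q3 - q1}, mean = {mean(data):.1f}")
```

The quartiles depend only on the values near the 25% and 75% positions of the sorted data, so changing the largest value leaves them and the IQR untouched, while the mean moves substantially.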