SECTION 2.2 Exercises

For Exercises 2.23 and 2.24, see page 75; for 2.25 and 2.26, see pages 77-78.

Question 2.27

2.27 Companies of the world

Refer to Exercise 1.118 (page 61), where we examined data collected by the World Bank on the numbers of companies that are incorporated and are listed on their country's stock exchange at the end of the year. In Exercise 2.10 (page 71), you examined the relationship between these numbers for 2012 and 2002.

inccom

  1. Find the correlation between these two variables.
  2. Do you think that the correlation you computed gives a good numerical summary of the strength of the relationship between these two variables? Explain your answer.

2.27

(a) . (b) Yes, there is a very strong linear relationship between the 2002 and 2012 data.
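
If software is used for part (a), the calculation is the same for any pair of quantitative variables. The following is a minimal Python sketch of the correlation as the average product of standardized values, r = (1/(n − 1)) Σ z_x z_y. The function name `pearson_r` and the company counts are made up for illustration; they are not the World Bank data.

```python
import math

def pearson_r(x, y):
    """Correlation r as the average product of standardized values."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sample standard deviations (divisor n - 1).
    s_x = math.sqrt(sum((xi - mean_x) ** 2 for xi in x) / (n - 1))
    s_y = math.sqrt(sum((yi - mean_y) ** 2 for yi in y) / (n - 1))
    # Sum of products of standardized values, divided by n - 1.
    return sum(((xi - mean_x) / s_x) * ((yi - mean_y) / s_y)
               for xi, yi in zip(x, y)) / (n - 1)

# Hypothetical counts of listed companies for five countries (not real data).
companies_2002 = [120, 450, 980, 2300, 5100]
companies_2012 = [150, 400, 1100, 2100, 4900]

r = pearson_r(companies_2002, companies_2012)
print(f"r = {r:.4f}")
```

A value of r near 1 supports an answer like the one above only if the scatterplot also looks linear, so the plot should still be checked first.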

Question 2.28

2.28 Companies of the world

Refer to the previous exercise and to Exercise 2.11 (page 72). Answer parts (a) and (b) for 2012 and 1992. Compare the correlation you found in the previous exercise with the one you found in this exercise. Why do they differ in this way?

inccom

Question 2.29

2.29 A product for lab experiments

In Exercise 2.17 (page 73), you described the relationship between time and count for an experiment examining the decay of barium.

decay

  1. Is the relationship between these two variables strong? Explain your answer.
  2. Find the correlation.
  3. Do you think that the correlation you computed gives a good numerical summary of the strength of the relationship between these two variables? Explain your answer.

2.29

(a) Yes, the data points form a clear, smooth curve. (b) . (c) No, the data show a curve, not a line; a transformation is needed before the correlation can summarize the relationship well.

Question 2.30

2.30 Use a log for the radioactive decay

Refer to the previous exercise and to Exercise 2.18 (page 73), where you transformed the counts with a logarithm.

decay

  1. Is the relationship between time and the log of the counts strong? Explain your answer.
  2. Find the correlation between time and the log of the counts.
  3. Do you think that the correlation you computed gives a good numerical summary of the strength of the relationship between these two variables? Explain your answer.
  4. Compare your results here with those you found in the previous exercise. Was the correlation useful in explaining the relationship before the transformation? After? Explain your answers.
  5. Using your answer in part (d), write a short explanation of what these analyses show about the use of a correlation to explain the strength of a relationship.
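
The comparison asked for in parts (d) and (e) can be sketched in Python with artificial data. The counts below are generated from an assumed exponential decay, count = 1000·exp(−0.3·time), with no measurement noise; they are not the barium data, and the helper `pearson_r` is only an illustrative name.

```python
import math

def pearson_r(x, y):
    """Pearson correlation from sums of squared deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up exponential decay (NOT the barium data): count = 1000 * exp(-0.3 * t).
time = list(range(11))
count = [1000 * math.exp(-0.3 * t) for t in time]
log_count = [math.log(c) for c in count]  # natural log of the counts

r_raw = pearson_r(time, count)
r_log = pearson_r(time, log_count)
print(f"time vs count:      r = {r_raw:.3f}")
print(f"time vs log(count): r = {r_log:.3f}")
```

Because these artificial counts follow the curve exactly, the logged values fall on a straight line. Real counts would scatter around the line, and the correlation after the transformation would be strong but not perfect.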

Question 2.31

2.31 Brand-to-brand variation in a product

In Exercise 2.12 (page 73), you examined the relationship between percent alcohol and calories per 12 ounces for 175 domestic brands of beer.

beer

  1. Compute the correlation between these two variables.
  2. Do you think that the correlation you computed gives a good numerical summary of the strength of the relationship between these two variables? Explain your answer.

2.31

(a) . (b) Yes, the relationship between percent alcohol and calories is quite linear, so the correlation gives a good numerical summary of the relationship.

Question 2.32

2.32 Alcohol and carbohydrates in beer revisited

Refer to the previous exercise. Delete any outliers that you identified in Exercise 2.12.

beer

  1. Recompute the correlation without the outliers.
  2. Write a short paragraph about the possible effects of outliers on the correlation, using this example to illustrate your ideas.

Question 2.33

2.33 Marketing in Canada

In Exercise 2.14 (page 73), you examined the relationship between the percent of the population over 65 and the percent under 15 for the 13 Canadian provinces and territories.

canadap

  1. Make a scatterplot of the two variables if you do not have your work from Exercise 2.14.
  2. Find the value of the correlation r.
  3. Does this numerical summary give a good indication of the strength of the relationship between these two variables? Explain your answer.

2.33

(b) . (c) No, although the relationship is mostly linear, there is an outlier, Nunavut, with a high percent of under 15 and a very low percent of over 65.

Question 2.34

2.34 Nunavut

Refer to the previous exercise.

canadap

  1. Do you think that Nunavut is an outlier? Explain your answer.
  2. Find the correlation without Nunavut. Using your work from the previous exercise, summarize the effect of Nunavut on the correlation.
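
One way to summarize the effect of a single point is to compute the correlation with and without it. The sketch below does this in Python. The percents are invented for illustration (they are not the Canadian data), and the last observation plays the role of an outlier with a very low percent over 65 and a high percent under 15.

```python
import math

def pearson_r(x, y):
    """Pearson correlation from sums of squared deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical percents for seven regions; the last one is the outlier.
pct_over65 = [13, 14, 15, 16, 17, 18, 4]
pct_under15 = [17, 18, 15, 16, 14, 16, 32]

r_all = pearson_r(pct_over65, pct_under15)
# Drop the last observation and recompute.
r_trimmed = pearson_r(pct_over65[:-1], pct_under15[:-1])

print(f"with the outlier:    r = {r_all:.3f}")
print(f"without the outlier: r = {r_trimmed:.3f}")
```

In this invented example the outlier lies along the direction of the overall negative pattern, so including it makes the correlation stronger. An outlier that falls away from the pattern can instead weaken the correlation, which is why the scatterplot should be examined alongside r.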

Question 2.35

2.35 Education spending and population with logs

In Example 2.3 (page 66), we examined the relationship between spending on education and population, and in Exercise 2.23 (page 75), you found the correlation between these two variables. In Example 2.6 (page 69), we examined the relationship between the variables transformed by logs.

edspend

  1. Compute the correlation between the variables expressed as logs.
  2. How does this correlation compare with the one you computed in Exercise 2.23? Discuss this result.

2.35

(a) r = 0.9808. (b) The correlation rose from 0.9798 before taking logs to 0.9808 after. The increase is very small, so the log transformation did little to improve how well a straight line describes the data.

Question 2.36

2.36 Are they outliers?

Refer to the previous exercise. Delete the four states with high values.

edspend

  1. Find the correlation between spending on education and population for the remaining 46 states.
  2. Do the same for these variables expressed as logs.
  3. Compare your results in parts (a) and (b) with the correlations that you computed with the full data set in Exercise 2.23 and in the previous exercise. Discuss these results.

Question 2.37

2.37 Fuel efficiency and CO2 emissions

In Example 2.7 (pages 70-71), we examined the relationship between highway MPG and CO2 emissions for 1067 vehicles for the model year 2014. Let's examine the relationship between the two measures of fuel efficiency in the data set, highway MPG and city MPG.

canfuel

  1. Make a scatterplot with city MPG on the x axis and highway MPG on the y axis.
  2. Describe the relationship.
  3. Calculate the correlation.
  4. Does this numerical summary give a good indication of the strength of the relationship between these two variables? Explain your answer.

2.37

(b) The relationship is somewhat linear but may also be slightly curved. Hwy MPG and City MPG increase together. (c) . (d) The correlation is a decent numerical summary because the data are somewhat linear, but a curve may provide a better description of the relationship.

Question 2.38

2.38 Consider the fuel type

Refer to the previous exercise and to Figure 2.6 (page 71), where different colors are used to distinguish four different types of fuels used by these vehicles.

canfuel

  1. Make a figure similar to Figure 2.6 that allows us to see the categorical variable, type of fuel, in the scatterplot. If your software does not have this capability, make different scatterplots for each fuel type.
  2. Discuss the relationship between highway MPG and city MPG, taking into account the type of fuel. Compare this view with what you found in the previous exercise where you did not make this distinction.
  3. Find the correlation between highway MPG and city MPG for each type of fuel. Write a short summary of what you have found.
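
For part (c), the correlation is computed separately within each category. A minimal Python sketch follows, assuming the data are available as (fuel type, city MPG, highway MPG) records. The fuel labels and MPG values are placeholders, not the canfuel data.

```python
import math
from collections import defaultdict

def pearson_r(x, y):
    """Pearson correlation from sums of squared deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical (fuel type, city MPG, highway MPG) records.
vehicles = [
    ("regular", 22, 30), ("regular", 25, 34), ("regular", 28, 37),
    ("premium", 17, 25), ("premium", 20, 28), ("premium", 15, 22),
    ("diesel", 30, 42), ("diesel", 27, 38), ("diesel", 33, 45),
    ("ethanol", 14, 19), ("ethanol", 16, 23), ("ethanol", 12, 17),
]

# Collect the city and highway values separately for each fuel type.
by_fuel = defaultdict(lambda: ([], []))
for fuel, city, hwy in vehicles:
    by_fuel[fuel][0].append(city)
    by_fuel[fuel][1].append(hwy)

# One correlation per fuel type.
r_by_fuel = {fuel: pearson_r(city, hwy) for fuel, (city, hwy) in by_fuel.items()}
for fuel, r in r_by_fuel.items():
    print(f"{fuel:8s} r = {r:.3f}")
```

The within-group correlations can differ from the correlation for all vehicles combined, so both views are worth reporting.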

Question 2.39

2.39 Match the correlation

The Correlation and Regression applet at the text website allows you to create a scatterplot by clicking and dragging with the mouse. The applet calculates and displays the correlation as you change the plot. You will use this applet to make scatterplots with 10 points that have correlation close to 0.7. The lesson is that many patterns can have the same correlation. Always plot your data before you trust a correlation.

  1. Stop after adding the first two points. What is the value of the correlation? Why does it have this value?
  2. Make a lower-left to upper-right pattern of 10 points with correlation about 0.7. (You can drag points up or down to adjust after you have 10 points.) Make a rough sketch of your scatterplot.
  3. Make another scatterplot with nine points in a vertical stack at the right of the plot. Add one point far to the left and move it until the correlation is close to 0.7. Make a rough sketch of your scatterplot.
  4. Make yet another scatterplot with 10 points in a curved pattern that starts at the lower left, rises to the right, then falls again at the far right. Adjust the points up or down until you have a quite smooth curve with correlation close to 0.7. Make a rough sketch of this scatterplot also.

Question 2.40

2.40 Stretching a scatterplot

Changing the units of measurement can greatly alter the appearance of a scatterplot. Consider the following data:

stretch

x     -4    -4    -3     3     4     4
y    0.5  -0.6  -0.5   0.5   0.5  -0.6

  1. Draw x and y axes each extending from -6 to 6. Plot the data on these axes.
  2. Calculate the values of new variables x* = x/10 and y* = 10y, starting from the values of x and y. Plot y* against x* on the same axes using a different plotting symbol. The two plots are very different in appearance.
  3. Find the correlation between x and y. Then find the correlation between x* and y*. How are the two correlations related? Explain why this isn't surprising.
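
The idea behind this exercise can be checked numerically: a change of units rescales a variable, and r is computed from standardized values, which do not depend on units. The Python sketch below uses invented data and an assumed rescaling (dividing one variable by 10 and multiplying the other by 10); the names are illustrative only.

```python
import math

def pearson_r(x, y):
    """Pearson correlation from sums of squared deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Invented data, used only to illustrate the effect of changing units.
x = [1, 2, 4, 5, 7, 8]
y = [0.3, 0.1, 0.6, 0.4, 0.9, 0.7]

# Assumed change of units: shrink x by a factor of 10, stretch y by 10.
x_rescaled = [xi / 10 for xi in x]
y_rescaled = [yi * 10 for yi in y]

r_original = pearson_r(x, y)
r_rescaled = pearson_r(x_rescaled, y_rescaled)
print(f"original units: r = {r_original:.6f}")
print(f"rescaled units: r = {r_rescaled:.6f}")
```

The two printed values agree up to floating-point rounding, even though scatterplots of the two versions drawn on the same axes would look very different.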

Question 2.41

2.41 CEO compensation and stock market performance

An academic study concludes, “The evidence indicates that the correlation between the compensation of corporate CEOs and the performance of their company's stock is close to zero.” A business magazine reports this as “A new study shows that companies that pay their CEOs highly tend to perform poorly in the stock market, and vice versa.” Explain why the magazine's report is wrong. Write a statement in plain language (don't use the word “correlation”) to explain the study's conclusion.

2.41

The magazine's report is wrong because it interprets a correlation close to 0 as a negative association rather than as little or no linear association.

Question 2.42

2.42 Investment reports and correlations

Investment reports often include correlations. Following a table of correlations among mutual funds, a report adds, “Two funds can have perfect correlation, yet different levels of risk. For example, Fund A and Fund B may be perfectly correlated, yet Fund A moves 20% whenever Fund B moves 10%.” Write a brief explanation, for someone who does not know statistics, of how this can happen. Include a sketch to illustrate your explanation.
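
A small numerical illustration may help with the explanation. In the Python sketch below, the monthly moves for Fund B are invented, and Fund A is assumed to move exactly twice as much as Fund B each month, matching the 20% versus 10% description in the quotation.

```python
import math
import statistics

def pearson_r(x, y):
    """Pearson correlation from sums of squared deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical monthly moves for Fund B, in percent.
fund_b = [1.0, -2.0, 0.5, 3.0, -1.5]
# Assumption: Fund A always moves exactly twice as much as Fund B.
fund_a = [2 * move for move in fund_b]

r = pearson_r(fund_b, fund_a)
spread_a = statistics.stdev(fund_a)
spread_b = statistics.stdev(fund_b)

print(f"correlation between the funds: r = {r:.3f}")
print(f"standard deviation of Fund A: {spread_a:.2f}")
print(f"standard deviation of Fund B: {spread_b:.2f}")
```

The points (Fund B move, Fund A move) fall exactly on a line with slope 2, so the correlation is perfect, yet Fund A's moves are twice as spread out. Correlation measures how closely the points follow a line, not how steep the line is.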

Question 2.43

2.43 Sloppy writing about correlation

Each of the following statements contains a blunder. Explain in each case what is wrong.

  1. “The correlation between and is but the correlation between and is .”
  2. “There is a high correlation between the color of a smartphone and the age of its owner.”
  3. “There is a very high correlation () between the premium you would pay for a standard automobile insurance policy and the number of accidents you have had in the last three years.”

2.43

(a) The correlation does not depend on the order of the variables; it is the same whichever variable is named first. (b) Correlation is defined only for quantitative variables; because color is categorical, no correlation can be computed for it. (c) A correlation can never exceed 1; a value of 1 indicates a perfect positive linear relationship.