SECTION 2.5 Exercises

For Exercise 2.91, see page 106; for 2.92 and 2.93, see pages 106–107; for 2.94 to 2.96, see page 108; for 2.97 to 2.99, see page 109; and for 2.100 and 2.101, see pages 111–112.

Question 2.102

2.102 Remote deposit capture

The Federal Reserve has called remote deposit capture (RDC) “the most important development the [U.S.] banking industry has seen in years.” This service allows users to scan checks and to transmit the scanned images to a bank for posting.16 In its annual survey of community banks, the American Bankers Association asked banks whether or not they offered this service.17 Here are the results classified by the asset size (in millions of dollars) of the bank:

                                Offer RDC
Asset size ($ in millions)      Yes     No
Under $100                       63    309
$101 to $200                     59    132
$201 or more                    112     85

Summarize the results of this survey question numerically and graphically. Write a short paragraph explaining the relationship between the size of a bank, measured by assets, and whether or not RDC is offered.

Question 2.103

2.103 How does RDC vary across the country?

The survey described in the previous exercise also classified community banks by region. Here is the 6 × 2 table of counts:18

               Offer RDC
Region         Yes     No
Northeast       28     38
Southeast       57     61
Central         53     84
Midwest         63    181
Southwest       27     51
West            61     76

Summarize the results of this survey question numerically and graphically. Write a short paragraph explaining the relationship between the location of a bank, measured by region, and whether or not remote deposit capture is offered.

Question 2.104

2.104 Exercise and adequate sleep

A survey of 656 boys and girls, ages 13 to 18, asked about adequate sleep and other health-related behaviors. The recommended amount of sleep is six to eight hours per night.19 In the survey, 54% of the respondents reported that they got less than this amount of sleep on school nights. The researchers also developed an exercise scale that was used to classify the students as above or below the median in how much they exercised. Here is the 2 × 2 table of counts with students classified as getting or not getting adequate sleep and by the exercise variable:

                   Exercise
Enough sleep      High    Low
Yes                151    115
No                 148    242

  (a) Find the distribution of adequate sleep for the high exercisers.
  (b) Do the same for the low exercisers.
  (c) If you have the appropriate software, use a mosaic plot to illustrate the marginal distribution of exercise and your results in parts (a) and (b).
  (d) Summarize the relationship between adequate sleep and exercise using the results of parts (a) and (b).
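
The conditional distributions asked for in parts (a) and (b) come from dividing each exercise column by its own column total. The following Python sketch shows one way to do that for the counts above; the function name `conditional_distribution` and the dictionary layout are illustrative choices, not part of the exercise.

```python
# Counts from the 2 x 2 table, stored as counts[exercise level][enough sleep]
counts = {
    "High": {"Yes": 151, "No": 148},
    "Low": {"Yes": 115, "No": 242},
}


def conditional_distribution(group_counts):
    """Return each category's percent of the group total."""
    total = sum(group_counts.values())
    return {category: 100 * n / total for category, n in group_counts.items()}


for level, group in counts.items():
    dist = conditional_distribution(group)
    # Round only for display; the underlying percents sum to 100 within a group
    print(level, {category: round(pct, 1) for category, pct in dist.items()})
```

Conditioning on exercise level (the columns) rather than on sleep (the rows) treats exercise as the explanatory variable, which is the direction parts (a) and (b) ask for.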

Question 2.105

2.105 Adequate sleep and exercise

Refer to the previous exercise.

  (a) Find the distribution of exercise for those who get adequate sleep.
  (b) Do the same for those who do not get adequate sleep.
  (c) Write a short summary of the relationship between adequate sleep and exercise using the results of parts (a) and (b).
  (d) Compare this summary with the summary that you obtained in part (d) of the previous exercise. Which do you prefer? Give a reason for your answer.

Question 2.106

2.106 Full-time and part-time college students

The Census Bureau provides estimates of numbers of people in the United States classified in various ways.20 Let’s look at college students. The following table gives us data to examine the relation between age and full-time or part-time status. The numbers in the table are expressed as thousands of U.S. college students.

                     Status
Age            Full-time   Part-time
15–19               3388         389
20–24               5238        1164
25–34               1703        1699
35 and over          762        2045

  (a) Find the distribution of age for full-time students.
  (b) Do the same for the part-time students.
  (c) Use the summaries in parts (a) and (b) to describe the relationship between full- or part-time status and age. Write a brief summary of your conclusions.

Question 2.107

2.107 Condition on age

Refer to the previous exercise.

  (a) For each age group, compute the percent of students who are full-time and the percent of students who are part-time.
  (b) Make a graphical display of the results that you found in part (a).
  (c) If you have the appropriate software, make a mosaic plot.
  (d) In a short paragraph, describe the relationship between age and full- or part-time status using your numerical and graphical summaries.
  (e) Explain why you need only the percents of students who are full-time for your summary in part (b).
  (f) Compare this way of summarizing the relationship between these two variables with what you presented in part (c) of the previous exercise.
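
Conditioning on age means dividing each row of the table in the previous exercise by its row total. As a rough sketch, assuming the counts are stored as (full-time, part-time) pairs keyed by age group, the row percents can be computed like this in Python; the names `students` and `percent_full_time` are hypothetical.

```python
# Thousands of U.S. college students: (full-time, part-time) by age group
students = {
    "15-19": (3388, 389),
    "20-24": (5238, 1164),
    "25-34": (1703, 1699),
    "35 and over": (762, 2045),
}


def percent_full_time(full_time, part_time):
    """Percent of an age group's students who are full-time."""
    return 100 * full_time / (full_time + part_time)


for age, (full_time, part_time) in students.items():
    pct_full = percent_full_time(full_time, part_time)
    # With only two categories, the part-time percent is the complement
    print(f"{age}: {pct_full:.1f}% full-time, {100 - pct_full:.1f}% part-time")
```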

Question 2.108

2.108 Lying to a teacher

One of the questions in a survey of high school students asked about lying to teachers.21 The accompanying table gives the numbers of students who said that they lied to a teacher about something significant at least once during the past year, classified by gender.

                            Gender
Lied at least once      Male    Female
Yes                     6067      5966
No                      4145      5719

  (a) Add the marginal totals to the table.
  (b) Calculate appropriate percents to describe the results of this question.
  (c) Summarize your findings in a short paragraph.

Question 2.109

2.109 Trust and honesty in the workplace

The students surveyed in the study described in the previous exercise were also asked whether they thought trust and honesty were essential in business and the workplace. Here are the counts classified by gender:

                                         Gender
Trust and honesty are essential      Male    Female
Agree                               9,097    10,935
Disagree                              685       423

Answer the questions given in the previous exercise for this survey question.

Question 2.110

2.110 Class size and course level

College courses taught at lower levels often have larger class sizes. The following table gives the number of classes classified by course level and class size.22 For example, there were 202 first-year level courses with between one and nine students.

                                  Class size
Course level       1–9   10–19   20–29   30–39   40–49   50–99   100 or more
1                  202     659     917     241      70      99           123
2                  190     370     486     307      84     109           134
3                  150     387     314     115      96     186            53
4                  146     256     190      83      67      64            17

  (a) Fill in the marginal totals in the table.
  (b) Find the marginal distribution for the variable course level.
  (c) Do the same for the variable class size.
  (d) For each course level, find the conditional distribution of class size.
  (e) Summarize your findings in a short paragraph.
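
Marginal totals are the row and column sums of the table, and each marginal distribution divides those sums by the grand total. The sketch below, in plain Python, illustrates the bookkeeping; the variable names are assumptions made for this example.

```python
size_labels = ["1-9", "10-19", "20-29", "30-39", "40-49", "50-99", "100 or more"]

# Number of classes by course level (rows) and class size (columns)
classes = {
    1: [202, 659, 917, 241, 70, 99, 123],
    2: [190, 370, 486, 307, 84, 109, 134],
    3: [150, 387, 314, 115, 96, 186, 53],
    4: [146, 256, 190, 83, 67, 64, 17],
}

# Marginal totals: sum across each row, and down each column
row_totals = {level: sum(row) for level, row in classes.items()}
col_totals = [sum(column) for column in zip(*classes.values())]
grand_total = sum(row_totals.values())

# Marginal distributions, expressed as percents of the grand total
level_marginal = {level: 100 * t / grand_total for level, t in row_totals.items()}
size_marginal = {
    label: 100 * t / grand_total for label, t in zip(size_labels, col_totals)
}

# Conditional distribution of class size within each course level
size_given_level = {
    level: [100 * n / row_totals[level] for n in row]
    for level, row in classes.items()
}
```

A quick consistency check is that the row totals and the column totals add up to the same grand total.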

Question 2.111

2.111 Hiring practices

A company has been accused of age discrimination in hiring for operator positions. Lawyers for both sides look at data on applicants for the past three years. They compare hiring rates for applicants younger than 40 years and those 40 years or older.

Age                 Hired    Not hired
Younger than 40        82         1160
40 or older             2          168

  (a) Find the two conditional distributions of hired/not hired: one for applicants who are younger than 40 and one for applicants who are 40 or older.
  (b) Based on your calculations, make a graph to show the differences in distribution for the two age categories.
  (c) Describe the company’s hiring record in words. Does the company appear to discriminate on the basis of age?
  (d) What lurking variables might be involved here?

Question 2.112

2.112 Nonresponse in a survey of companies

A business school conducted a survey of companies in its state. It mailed a questionnaire to 200 small companies, 200 medium-sized companies, and 200 large companies. The rate of nonresponse is important in deciding how reliable survey results are. Here are the data on response to this survey:

                Small    Medium    Large
Response          124        80       41
No response        76       120      159
Total             200       200      200

  (a) What was the overall percent of nonresponse?
  (b) Describe how nonresponse is related to the size of the business. (Use percents to make your statements precise.)
  (c) Draw a bar graph to compare the nonresponse percents for the three size categories.
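
Because every size category received the same number of questionnaires, the nonresponse percents can be compared directly. The following Python sketch computes them and prints a crude text bar graph; the dictionary names and the one-character-per-two-percent scale are arbitrary choices for this illustration.

```python
mailed = {"Small": 200, "Medium": 200, "Large": 200}
responses = {"Small": 124, "Medium": 80, "Large": 41}

# Percent of mailed questionnaires that were not returned, by company size
nonresponse_pct = {
    size: 100 * (mailed[size] - responses[size]) / mailed[size] for size in mailed
}

# Overall nonresponse pools all three size categories
total_mailed = sum(mailed.values())
overall_pct = 100 * (total_mailed - sum(responses.values())) / total_mailed

for size, pct in nonresponse_pct.items():
    # One '#' per two percentage points
    print(f"{size:<8}{'#' * round(pct / 2)} {pct:.1f}%")
print(f"Overall nonresponse: {overall_pct:.1f}%")
```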

Question 2.113

2.113 Demographics and new products

Companies planning to introduce a new product to the market must define the “target” for the product. Who do we hope to attract with our new product? Age and gender are two of the most important demographic variables. The following two-way table describes the age and marital status of American women.23 The table entries are in thousands of women.

                                 Marital status
Age (years)      Never married    Married    Widowed    Divorced
18 to 24                12,112      2,171         23         164
25 to 39                 9,472     18,219        177       2,499
40 to 64                 5,224     35,021      2,463       8,674
≥ 65                       984      9,688      8,699       2,412

  (a) Find the sum of the entries for each column.
  (b) Find the marginal distributions.
  (c) Find the conditional distributions.
  (d) If you have the appropriate software, make a mosaic plot.
  (e) Write a short description of the relationship between marital status and age for women.

Question 2.114

2.114 Demographics, continued

  (a) Using the data in the previous exercise, compare the conditional distributions of marital status for women aged 18 to 24 and women aged 40 to 64. Briefly describe the most important differences between the two groups of women, and back up your description with percents.
  (b) Your company is planning a magazine aimed at women who have never been married. Find the conditional distribution of age among never-married women, and display it in a bar graph. What age group or groups should your magazine aim to attract?

Question 2.115

2.115 Demographics and new products—men

Refer to Exercises 2.113 and 2.114. Here are the corresponding counts for men:

                                 Marital status
Age (years)      Never married    Married    Widowed    Divorced
18 to 24                13,509      1,245          6          63
25 to 39                12,685     16,029         78       1,790
40 to 64                 6,869     34,650        760       6,647
≥ 65                       685     12,514      2,124       1,464

Answer the questions from Exercises 2.113 and 2.114 for these counts.

Question 2.116

2.116 Discrimination?

Wabash Tech has two professional schools, business and law. Here are two-way tables of applicants to both schools, categorized by gender and admission decision. (Although these data are made up, similar situations occur in reality.)

Business
            Admit    Deny
Male          480     120
Female        180      20

Law
            Admit    Deny
Male           10      90
Female        100     200

  (a) Make a two-way table of gender by admission decision for the two professional schools together by summing entries in these tables.
  (b) From the two-way table, calculate the percent of male applicants who are admitted and the percent of female applicants who are admitted. Wabash admits a higher percent of male applicants.
  (c) Now compute separately the percents of male and female applicants admitted by the business school and by the law school. Each school admits a higher percent of female applicants.
  (d) This is Simpson’s paradox: both schools admit a higher percent of the women who apply, but overall, Wabash admits a lower percent of female applicants than of male applicants. Explain carefully, as if speaking to a skeptical reporter, how it can happen that Wabash appears to favor males when each school individually favors females.
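
The reversal described above can be checked numerically by computing admission rates within each school and then again after pooling the two schools. This is a minimal Python sketch using the counts given for Wabash Tech; `applicants`, `admit_rate`, and `combined` are names chosen for this illustration.

```python
# (admit, deny) counts by school and gender
applicants = {
    "Business": {"Male": (480, 120), "Female": (180, 20)},
    "Law": {"Male": (10, 90), "Female": (100, 200)},
}


def admit_rate(admit, deny):
    """Percent of applicants admitted."""
    return 100 * admit / (admit + deny)


# Pool the two schools by summing counts for each gender
combined = {}
for school_counts in applicants.values():
    for gender, (admit, deny) in school_counts.items():
        pooled_admit, pooled_deny = combined.get(gender, (0, 0))
        combined[gender] = (pooled_admit + admit, pooled_deny + deny)

for school, school_counts in applicants.items():
    for gender, pair in school_counts.items():
        print(f"{school:<9}{gender:<7}{admit_rate(*pair):5.1f}% admitted")
for gender, pair in combined.items():
    print(f"{'Combined':<9}{gender:<7}{admit_rate(*pair):5.1f}% admitted")
```

Comparing how many men and how many women apply to each school, alongside these rates, is a useful starting point for the explanation.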

Question 2.117

2.117 Obesity and health

Recent studies have shown that earlier reports underestimated the health risks associated with being overweight. The error was due to lurking variables. In particular, smoking tends both to reduce weight and to lead to earlier death. Illustrate Simpson’s paradox by a simplified version of this situation. That is, make up tables of overweight (yes or no) by early death (yes or no) by smoker (yes or no) such that

  • Overweight smokers and overweight nonsmokers both tend to die earlier than those not overweight.
  • But when smokers and nonsmokers are combined into a two-way table of overweight by early death, persons who are not overweight tend to die earlier.
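
A candidate set of tables can be tested against both conditions mechanically. The sketch below defines a hypothetical helper, `shows_simpsons_paradox`, and tries it on counts that are invented purely for illustration; they are not data from any study.

```python
def early_death_rate(early, not_early):
    """Proportion of a group that died early."""
    return early / (early + not_early)


def shows_simpsons_paradox(smokers, nonsmokers):
    """Check both conditions of the exercise.

    Each argument maps 'overweight' and 'not overweight' to a pair of counts
    (early death, no early death).
    """
    groups = ("overweight", "not overweight")
    # Condition 1: within each smoking group, the overweight die earlier
    within = all(
        early_death_rate(*table["overweight"])
        > early_death_rate(*table["not overweight"])
        for table in (smokers, nonsmokers)
    )
    # Condition 2: after combining the groups, the direction reverses
    combined = {
        g: tuple(s + n for s, n in zip(smokers[g], nonsmokers[g])) for g in groups
    }
    reversed_overall = early_death_rate(*combined["overweight"]) < early_death_rate(
        *combined["not overweight"]
    )
    return within and reversed_overall


# Hypothetical counts, made up for illustration only
smokers = {"overweight": (50, 50), "not overweight": (360, 540)}
nonsmokers = {"overweight": (90, 810), "not overweight": (5, 95)}
print(shows_simpsons_paradox(smokers, nonsmokers))
```

The made-up counts rely on most smokers being not overweight and most nonsmokers being overweight, which is the imbalance that allows the combined table to reverse.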

Question 2.118

2.118 Find the table

Here are the row and column totals for a two-way table with two rows and two columns:

a b 60
c d 60
70 50 120

Find two different sets of counts a, b, c, and d for the body of the table that give these same totals. This shows that the relationship between two variables cannot be obtained from the two individual distributions of the variables.
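
Once a is chosen, the totals force b = 60 − a, c = 70 − a, and d = 60 − c, so the search is over a single free value. The following Python sketch enumerates every table of nonnegative counts with the given totals; the function name `tables_with_margins` is an assumption made for this example.

```python
def tables_with_margins(row_totals, col_totals):
    """Yield every (a, b, c, d) of nonnegative counts matching the margins."""
    r1, r2 = row_totals
    c1, c2 = col_totals
    for a in range(min(r1, c1) + 1):
        b = r1 - a  # first row must sum to r1
        c = c1 - a  # first column must sum to c1
        d = r2 - c  # second row must sum to r2
        if b >= 0 and c >= 0 and d >= 0 and b + d == c2:
            yield a, b, c, d


solutions = list(tables_with_margins((60, 60), (70, 50)))
print(len(solutions), "tables share these totals")
print("first:", solutions[0], "last:", solutions[-1])
```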