SECTION 2.5 Exercises

For Exercise 2.91, see page 106; for 2.92 and 2.93, see pages 106–107; for 2.94 to 2.96, see page 108; for 2.97 to 2.99, see page 109; and for 2.100 and 2.101, see pages 111–112.

Question 2.102

2.102 Remote deposit capture

The Federal Reserve has called remote deposit capture (RDC) “the most important development the [U.S.] banking industry has seen in years.” This service allows users to scan checks and to transmit the scanned images to a bank for posting.16 In its annual survey of community banks, the American Bankers Association asked banks whether or not they offered this service.17 Here are the results classified by the asset size (in millions of dollars) of the bank:

rdc

Offer RDC
Asset size ($ in millions) Yes No
Under $100 63 309
$101 to $200 59 132
$201 or more 112 85

Summarize the results of this survey question numerically and graphically. Write a short paragraph explaining the relationship between the size of a bank, measured by assets, and whether or not RDC is offered.

Question 2.103

2.103 How does RDC vary across the country?

The survey described in the previous exercise also classified community banks by region. Here is the table of counts:18

Offer RDC
Region Yes No
Northeast 28 38
Southeast 57 61
Central 53 84
Midwest 63 181
Southwest 27 51
West 61 76

Summarize the results of this survey question numerically and graphically. Write a short paragraph explaining the relationship between the location of a bank, measured by region, and whether or not remote deposit capture is offered.

2.103

Only 37% of all banks offer RDC. Regions with high percentages of banks offering RDC are Southeast (48.31%), West (44.53%), and Northeast (42.42%). Midwest (25.82%) has a low percentage of banks offering RDC.
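
The percents quoted in this answer are the share of banks offering RDC within each region. A minimal Python sketch of that calculation, using the counts from the table (the variable names are illustrative, not part of the exercise):

```python
# Counts from the region table: region -> (offers RDC, does not offer RDC).
rdc_by_region = {
    "Northeast": (28, 38),
    "Southeast": (57, 61),
    "Central": (53, 84),
    "Midwest": (63, 181),
    "Southwest": (27, 51),
    "West": (61, 76),
}

# Percent of banks in each region that offer RDC.
pct_yes = {
    region: 100 * yes / (yes + no)
    for region, (yes, no) in rdc_by_region.items()
}

# Overall percent across all regions.
total_yes = sum(yes for yes, _ in rdc_by_region.values())
total_banks = sum(yes + no for yes, no in rdc_by_region.values())
overall_pct = 100 * total_yes / total_banks

# Print regions from highest to lowest percent, then the overall figure.
for region, pct in sorted(pct_yes.items(), key=lambda item: -item[1]):
    print(f"{region:<10} {pct:6.2f}%")
print(f"{'Overall':<10} {overall_pct:6.2f}%")
```

Sorting by percent makes the contrast between the Southeast and the Midwest easy to see, and the same ordering can be reused for a bar graph.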

Question 2.104

2.104 Exercise and adequate sleep

A survey of 656 boys and girls, ages 13 to 18, asked about adequate sleep and other health-related behaviors. The recommended amount of sleep is six to eight hours per night.19 In the survey, 54% of the respondents reported that they got less than this amount of sleep on school nights. The researchers also developed an exercise scale that was used to classify the students as above or below the median in how much they exercised. Here is the table of counts with students classified as getting or not getting adequate sleep and by the exercise variable:

sleep

Exercise
Enough sleep High Low
Yes 151 115
No 148 242
  1. Find the distribution of adequate sleep for the high exercisers.
  2. Do the same for the low exercisers.
  3. If you have the appropriate software, use a mosaic plot to illustrate the marginal distribution of exercise and your results in parts (a) and (b).
  4. Summarize the relationship between adequate sleep and exercise using the results of parts (a) and (b).

Question 2.105

2.105 Adequate sleep and exercise

Refer to the previous exercise.

sleep

  1. Find the distribution of exercise for those who get adequate sleep.
  2. Do the same for those who do not get adequate sleep.
  3. Write a short summary of the relationship between adequate sleep and exercise using the results of parts (a) and (b).
  4. Compare this summary with the summary that you obtained in part (d) of the previous exercise. Which do you prefer? Give a reason for your answer.

2.105

(a) For those who get enough sleep, 56.8% are high exercisers and 43.2% are low exercisers. (b) For those who don't get enough sleep, 37.9% are high exercisers and 62.1% are low exercisers. (c) Those who get enough sleep are more likely to be high exercisers than those who don't get enough sleep.
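
Parts (a) and (b) condition on the row variable: each count is divided by its row total. A short Python sketch under that reading, with a hypothetical helper named `conditional_on_rows`:

```python
# Counts from the sleep table: rows are "enough sleep", columns are exercise level.
counts = {
    "Yes": {"High": 151, "Low": 115},
    "No": {"High": 148, "Low": 242},
}


def conditional_on_rows(table):
    """Return, for each row, the percent of that row's total in each column."""
    result = {}
    for row, cells in table.items():
        row_total = sum(cells.values())
        result[row] = {col: 100 * n / row_total for col, n in cells.items()}
    return result


exercise_given_sleep = conditional_on_rows(counts)
for row, dist in exercise_given_sleep.items():
    print(row, {col: round(pct, 1) for col, pct in dist.items()})
```

Dividing by column totals instead would give the distribution of sleep within each exercise group, which is the view taken in Exercise 2.104.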

Question 2.106

2.106 Full-time and part-time college students

The Census Bureau provides estimates of numbers of people in the United States classified in various ways.20 Let's look at college students. The following table gives us data to examine the relation between age and full-time or part-time status. The numbers in the table are expressed as thousands of U.S. college students.

colstud

Status
Age Full-time Part-time
15–19 3388 389
20–24 5238 1164
25–34 1703 1699
35 and over 762 2045
  1. Find the distribution of age for full-time students.
  2. Do the same for the part-time students.
  3. Use the summaries in parts (a) and (b) to describe the relationship between full- or part-time status and age. Write a brief summary of your conclusions.

Question 2.107

2.107 Condition on age

Refer to the previous exercise.

colstud

  1. For each age group, compute the percent of students who are full-time and the percent of students who are part-time.
  2. Make a graphical display of the results that you found in part (a).
  3. If you have the appropriate software, make a mosaic plot.
  4. In a short paragraph, describe the relationship between age and full- or part-time status using your numerical and graphical summaries.
  5. Explain why you need only the percents of students who are full-time for your summary in part (b).
  6. Compare this way of summarizing the relationship between these two variables with what you presented in part (c) of the previous exercise.

2.107

(a) For Age 15 to 19: 89.70% are Full-time and 10.30% are Part-time. For Age 20 to 24: 81.82% are Full-time and 18.18% are Part-time. For Age 25 to 34: 50.06% are Full-time and 49.94% are Part-time. For Age 35 and Over: 27.15% are Full-time and 72.85% are Part-time. (d) Students aged 15–24 are much more likely to be Full-time, while students aged 35 and over are more likely to be Part-time. Students aged 25–34 are about equally likely to be Full- or Part-time students. (e) Because there are only 2 categories for Status, if we are given the percentage of Full-time students, the percentage of Part-time students must be 100% minus the percentage for Full-time. (f) Both are valid descriptions; it mostly depends on the condition in which you are interested. If we are interested in a particular age group, the current analysis likely has more meaning, whereas if we are interested in a particular status, the previous analysis has more meaning.

Question 2.108

2.108 Lying to a teacher

One of the questions in a survey of high school students asked about lying to teachers.21 The accompanying table gives the numbers of students who said that they lied to a teacher about something significant at least once during the past year, classified by sex.

lying

Sex
Lied at least once Male Female
Yes 6067 5966
No 4145 5719
  1. Add the marginal totals to the table.
  2. Calculate appropriate percents to describe the results of this question.
  3. Summarize your findings in a short paragraph.

Question 2.109

2.109 Trust and honesty in the workplace

The students surveyed in the study described in the previous exercise were also asked whether they thought trust and honesty were essential in business and the workplace. Here are the counts classified by sex:

trust

Sex
Trust and honesty are essential Male Female
Agree 9,097 10,935
Disagree 685 423

Answer the questions given in the previous exercise for this survey question.

2.109

There were 21,140 students total; 20,032 agree and 1,108 disagree; 11,358 female and 9,782 male. 96% of females and 93% of males agreed that trust and honesty are essential. A slightly higher percentage of females said that trust and honesty are essential.
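
The marginal totals in this answer come from summing the rows and columns of the table of counts. One way to sketch that in Python (the names below are illustrative):

```python
# Counts from the trust table: rows are responses, columns are sex.
counts = {
    "Agree": {"Male": 9097, "Female": 10935},
    "Disagree": {"Male": 685, "Female": 423},
}

# Row totals: marginal counts for the response variable.
row_totals = {row: sum(cells.values()) for row, cells in counts.items()}

# Column totals: marginal counts for sex.
col_totals = {}
for cells in counts.values():
    for col, n in cells.items():
        col_totals[col] = col_totals.get(col, 0) + n

grand_total = sum(row_totals.values())

# Percent agreeing within each sex (conditioning on the column variable).
pct_agree = {
    col: 100 * counts["Agree"][col] / col_totals[col] for col in col_totals
}

print(row_totals, col_totals, grand_total)
print({col: round(pct) for col, pct in pct_agree.items()})
```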

Question 2.110

2.110 Class size and course level

College courses taught at lower levels often have larger class sizes. The following table gives the number of classes classified by course level and class size.22 For example, there were 202 first-year level courses with between one and nine students.

csize

Class size
Course level 1–9 10–19 20–29 30–39 40–49 50–99 100 or more
1 202 659 917 241 70 99 123
2 190 370 486 307 84 109 134
3 150 387 314 115 96 186 53
4 146 256 190 83 67 64 17
  1. Fill in the marginal totals in the table.
  2. Find the marginal distribution for the variable course level.
  3. Do the same for the variable class size.
  4. For each course level, find the conditional distribution of class size.
  5. Summarize your findings in a short paragraph.

Question 2.111

2.111 Hiring practices

A company has been accused of age discrimination in hiring for operator positions. Lawyers for both sides look at data on applicants for the past three years. They compare hiring rates for applicants younger than 40 years and those 40 years or older.

hiring

Age Hired Not hired
Younger than 40 82 1160
40 or older 2 168
  1. Find the two conditional distributions of hired/not hired—one for applicants who are less than 40 years old and one for applicants who are not less than 40 years old.
  2. Based on your calculations, make a graph to show the differences in distribution for the two age categories.
  3. Describe the company's hiring record in words. Does the company appear to discriminate on the basis of age?
  4. What lurking variables might be involved here?

2.111

(a) For applicants younger than 40: 6.60% were hired and 93.40% were not. For applicants 40 or older: 1.18% were hired and 98.82% were not. (c) The percent hired is higher for applicants younger than 40, so the company appears to discriminate on the basis of age. (d) Education could differ between the age groups, making applicants more or less qualified.

Question 2.112

2.112 Nonresponse in a survey of companies

A business school conducted a survey of companies in its state. It mailed a questionnaire to 200 small companies, 200 medium-sized companies, and 200 large companies. The rate of nonresponse is important in deciding how reliable survey results are. Here are the data on response to this survey:

nresp

Small Medium Large
Response 124 80 41
No response 76 120 159
Total 200 200 200
  1. What was the overall percent of nonresponse?
  2. Describe how nonresponse is related to the size of the business. (Use percents to make your statements precise.)
  3. Draw a bar graph to compare the nonresponse percents for the three size categories.

Question 2.113

2.113 Demographics and new products

Companies planning to introduce a new product to the market must define the “target” for the product. Who do we hope to attract with our new product? Age and sex are two of the most important demographic variables. The following two-way table describes the age and marital status of American women.23 The table entries are in thousands of women.

agegen

Marital status
Age (years) Never married Married Widowed Divorced
18 to 24 12,112 2,171 23 164
25 to 39 9,472 18,219 177 2,499
40 to 64 5,224 35,021 2,463 8,674
65 and over 984 9,688 8,699 2,412
  1. Find the sum of the entries for each column.
  2. Find the marginal distributions.
  3. Find the conditional distributions.
  4. If you have the appropriate software, make a mosaic plot.
  5. Write a short description of the relationship between marital status and age for women.

2.113

(a) 27,792 never married; 65,099 married; 11,362 widowed; 13,749 divorced. (b) Marginal distributions:

Percent Never-married Married Widowed Divorced Total
18 to 24 10.26 1.84 0.02 0.14 12.3
25 to 39 8.03 15.44 0.15 2.12 25.7
40 to 64 4.43 29.68 2.09 7.35 43.5
65 and over 0.83 8.21 7.37 2.04 18.5
Total 23.55 55.17 9.63 11.65 100

(c) Conditional distribution given Marital Status:

Percent Never-married Married Widowed Divorced
18 to 24 43.58 3.33 0.2 1.19
25 to 39 34.08 27.99 1.56 18.18
40 to 64 18.8 53.8 21.68 63.09
65 and over 3.54 14.88 76.56 17.54
Total 100 100 100 100

Conditional distribution given Age:

Percent Never-married Married Widowed Divorced Total
18 to 24 83.7 15 0.16 1.13 100
25 to 39 31.19 60 0.58 8.23 100
40 to 64 10.17 68.16 4.79 16.88 100
65 and over 4.52 44.48 39.93 11.07 100

(e) More than half of women are married; of that group, age 40 to 64 is the most common followed by 25 to 39. Almost 25% never married, but most of that group is represented by younger age groups. Widowed and Divorced have relatively small percentages across the board, though the 65 and Over group is most likely to be widowed and the 40 to 64 group is most likely to be divorced.
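
The three tables in this answer differ only in the denominator: the grand total for the joint distribution, the column total when conditioning on marital status, and the row total when conditioning on age. A minimal Python sketch of the three calculations (list names are illustrative; the last age group is taken to be 65 and over, matching the answer tables):

```python
ages = ["18 to 24", "25 to 39", "40 to 64", "65 and over"]
statuses = ["Never married", "Married", "Widowed", "Divorced"]

# Counts in thousands of women, one row per age group, one column per status.
counts = [
    [12112, 2171, 23, 164],
    [9472, 18219, 177, 2499],
    [5224, 35021, 2463, 8674],
    [984, 9688, 8699, 2412],
]

row_totals = [sum(row) for row in counts]
col_totals = [sum(col) for col in zip(*counts)]
grand_total = sum(row_totals)

# Joint distribution: each cell as a percent of the grand total.
joint = [[100 * n / grand_total for n in row] for row in counts]

# Conditional on marital status: each cell as a percent of its column total.
given_status = [
    [100 * n / col_totals[j] for j, n in enumerate(row)] for row in counts
]

# Conditional on age: each cell as a percent of its row total.
given_age = [
    [100 * n / row_totals[i] for n in row] for i, row in enumerate(counts)
]

for age, row in zip(ages, given_age):
    print(age, [round(pct, 2) for pct in row])
```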

Question 2.114

2.114 Demographics, continued

  1. Using the data in the previous exercise, compare the conditional distributions of marital status for women aged 18 to 24 and women aged 40 to 64. Briefly describe the most important differences between the two groups of women, and back up your description with percents.
  2. Your company is planning a magazine aimed at women who have never been married. Find the conditional distribution of age among never-married women, and display it in a bar graph. What age group or groups should your magazine aim to attract?

agegen

Question 2.115

2.115 Demographics and new products—men

Refer to Exercises 2.113 and 2.114. Here are the corresponding counts for men:

agegen

Marital status
Age (years) Never married Married Widowed Divorced
18 to 24 13,509 1,245 6 63
25 to 39 12,685 16,029 78 1,790
40 to 64 6,869 34,650 760 6,647
65 and over 685 12,514 2,124 1,464

Answer the questions from Exercises 2.113 and 2.114 for these counts.

2.115

33,748 never married; 64,438 married; 2,968 widowed; 9,964 divorced. Joint and marginal distributions:

Percent Never-married Married Widowed Divorced Total
18 to 24 12.16 1.12 0.01 0.06 13.3
25 to 39 11.42 14.43 0.07 1.61 27.5
40 to 64 6.18 31.18 0.68 5.98 44
65 and over 0.62 11.26 1.91 1.32 15.1
Total 30.37 57.99 2.67 8.97 100

Conditional distribution given Marital Status:

Percent Never-married Married Widowed Divorced
18 to 24 40.03 1.93 0.2 0.63
25 to 39 37.59 24.88 2.63 17.96
40 to 64 20.35 53.77 25.61 66.71
65 and over 2.03 19.42 71.56 14.69
Total 100 100 100 100

Conditional distribution given Age:

Percent Never-married Married Widowed Divorced Total
18 to 24 91.14 8.4 0.04 0.43 100
25 to 39 41.48 52.41 0.26 5.85 100
40 to 64 14.04 70.82 1.55 13.59 100
65 and over 4.08 74.55 12.65 8.72 100

More than half of men are married; of that group, age 40 to 64 is the most common, followed by 25 to 39 and 65 and Over. More than 30% never married, very few of whom are 65 and Over. Fewer than 3% of men are widowed, and the vast majority of those are 65 and Over. About 9% are divorced, two-thirds of them in the 40 to 64 age group.

Question 2.116

2.116 Discrimination?

Wabash Tech has two professional schools, business and law. Here are two-way tables of applicants to both schools, categorized by sex and admission decision. (Although these data are made up, similar situations occur in reality.)

disc

Business
Admit Deny
Male 480 120
Female 180 20
Law
Admit Deny
Male 10 90
Female 100 200
  1. Make a two-way table of sex by admission decision for the two professional schools together by summing entries in these tables.
  2. From the two-way table, calculate the percent of male applicants who are admitted and the percent of female applicants who are admitted. Wabash admits a higher percent of male applicants.
  3. Now compute separately the percents of male and female applicants admitted by the business school and by the law school. Each school admits a higher percent of female applicants.
  4. This is Simpson's paradox: both schools admit a higher percent of the women who apply, but overall, Wabash admits a lower percent of female applicants than of male applicants. Explain carefully, as if speaking to a skeptical reporter, how it can happen that Wabash appears to favor males when each school individually favors females.
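
The percents named in parts (b) and (c) can be checked with a few lines of Python. This is only a sketch using the counts from the two tables; the function and variable names are illustrative.

```python
# Counts from the two tables: school -> sex -> (admitted, denied).
applicants = {
    "Business": {"Male": (480, 120), "Female": (180, 20)},
    "Law": {"Male": (10, 90), "Female": (100, 200)},
}


def admit_rate(admit, deny):
    """Percent of applicants admitted."""
    return 100 * admit / (admit + deny)


# Admission rate by sex within each school.
by_school = {
    school: {sex: admit_rate(*cells) for sex, cells in table.items()}
    for school, table in applicants.items()
}

# Combine the two schools by summing matching cells, then recompute the rates.
combined = {}
for table in applicants.values():
    for sex, (admit, deny) in table.items():
        prev_admit, prev_deny = combined.get(sex, (0, 0))
        combined[sex] = (prev_admit + admit, prev_deny + deny)
overall = {sex: admit_rate(*cells) for sex, cells in combined.items()}

print("By school:", by_school)
print("Combined: ", overall)
```

Comparing how many men and women apply to each school, alongside these rates, is a useful starting point for the explanation in part (d).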

Question 2.117

2.117 Obesity and health

Recent studies have shown that earlier reports underestimated the health risks associated with being overweight. The error was due to lurking variables. In particular, smoking tends both to reduce weight and to lead to earlier death. Illustrate Simpson's paradox by a simplified version of this situation. That is, make up tables of overweight (yes or no) by early death (yes or no) by smoker (yes or no) such that

  • Overweight smokers and overweight nonsmokers both tend to die earlier than those not overweight.
  • But when smokers and nonsmokers are combined into a two-way table of overweight by early death, persons who are not overweight tend to die earlier.

Question 2.118

2.118 Find the table

Here are the row and column totals for a two-way table with two rows and two columns:

a b 60
c d 60
70 50 120

Find two different sets of counts a, b, c, and d for the body of the table that give these same totals. This shows that the relationship between two variables cannot be obtained from the two individual distributions of the variables.
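
To see how many tables share these totals, one can enumerate them. The Python sketch below assumes the body cells are labeled a and b in the first row and c and d in the second, with nonnegative integer counts; the function name is hypothetical.

```python
row_totals = (60, 60)
col_totals = (70, 50)


def tables_with_margins(row_totals, col_totals):
    """Yield every 2x2 table of nonnegative integer counts with the given margins."""
    r1, r2 = row_totals
    c1, c2 = col_totals
    # Once a is chosen, the other three cells are forced by the totals.
    for a in range(min(r1, c1) + 1):
        b = r1 - a
        c = c1 - a
        d = r2 - c
        if min(b, c, d) >= 0 and b + d == c2:
            yield ((a, b), (c, d))


tables = list(tables_with_margins(row_totals, col_totals))
print(len(tables), "tables share these totals; two of them are:")
print(tables[0])
print(tables[-1])
```

Because fixing a single cell determines the rest, the totals leave one free choice, and different choices give different relationships between the two variables.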