Like our test for a population proportion, the chi-
Cell counts required for the chi-
You can safely use the chi-
The cocaine study easily passes this test: all the expected cell counts are either 8 or 16. Here is a concluding example that outlines the examination of a two-
EXAMPLE 6 Do angry people have a greater incidence of heart disease?
People who get angry easily tend to have more heart disease. That’s the conclusion of a study that followed a random sample of 12,986 people from three locations for about four years. All subjects were free of heart disease at the beginning of the study. The subjects took the Spielberger Trait Anger Scale test, which measures how prone a person is to sudden anger. Here are data for the 8474 people in the sample who had normal blood pressure. CHD stands for “coronary heart disease.” This includes people who had heart attacks and those who needed medical treatment for heart disease.
Anger score | Low | Moderate | High |
---|---|---|---|
Sample size | 3110 | 4731 | 633 |
CHD count | 53 | 110 | 27 |
CHD percent | 1.7% | 2.3% | 4.3% |
There is a clear trend: as the anger score increases, so does the percentage who suffer heart disease. Is this relationship between anger and heart disease statistically significant?
The first step is to write the data as a two-
| | Low anger | Moderate anger | High anger | Total |
---|---|---|---|---|
CHD | 53 | 110 | 27 | 190 |
No CHD | 3057 | 4621 | 606 | 8284 |
Total | 3110 | 4731 | 633 | 8474 |
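The two-way table can be rebuilt from the sample sizes and CHD counts given earlier. The following is a minimal Python sketch of that bookkeeping; the variable names are illustrative and do not come from the study.

```python
# Counts for the 8474 people with normal blood pressure, by anger score.
groups = ["Low", "Moderate", "High"]
sample_size = [3110, 4731, 633]
chd = [53, 110, 27]

# Everyone in a group who did not develop CHD goes in the "No CHD" row.
no_chd = [n - c for n, c in zip(sample_size, chd)]

# Row totals, column totals, and the table total.
row_totals = {"CHD": sum(chd), "No CHD": sum(no_chd)}
col_totals = [c + n for c, n in zip(chd, no_chd)]
table_total = sum(col_totals)

# Percent with CHD in each anger group, rounded to one decimal place.
chd_percent = [round(100 * c / n, 1) for c, n in zip(chd, sample_size)]

for g, c, n, p in zip(groups, chd, no_chd, chd_percent):
    print(f"{g}: CHD={c}, No CHD={n}, CHD percent={p}%")
print("Row totals:", row_totals, "Table total:", table_total)
```

The column totals recover the original sample sizes, which is a quick check that no count was mistyped.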
We can now follow the steps for a significance test, familiar from Chapter 22.
The hypotheses. The chi-
H0: no association between anger and CHD
Ha: some association between anger and CHD
The sampling distribution. We will see that all the expected cell counts are larger than 5, so we can safely apply the chi-
The data. First find the expected cell counts. For example, the expected count of high-
$$
\text{expected count} = \frac{\text{row 1 total} \times \text{column 3 total}}{\text{table total}} = \frac{(190)(633)}{8474} = 14.19
$$
Here is the complete table of observed and expected counts side by side:
| | Observed: Low | Observed: Moderate | Observed: High | Expected: Low | Expected: Moderate | Expected: High |
---|---|---|---|---|---|---|
CHD | 53 | 110 | 27 | 69.73 | 106.08 | 14.19 |
No CHD | 3057 | 4621 | 606 | 3040.27 | 4624.92 | 618.81 |
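The expected counts in this table all come from the same rule: row total times column total, divided by the table total. Here is a short Python sketch that applies the rule to every cell and checks that each expected count is larger than 5; the variable names are illustrative.

```python
# Observed counts: rows are CHD / No CHD, columns are Low / Moderate / High anger.
observed = [
    [53, 110, 27],       # CHD
    [3057, 4621, 606],   # No CHD
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
table_total = sum(row_totals)

# Expected count for each cell = row total * column total / table total.
expected = [[r * c / table_total for c in col_totals] for r in row_totals]

# Round to two decimals for display, as in the table.
rounded = [[round(e, 2) for e in row] for row in expected]

# Check the cell-count condition used in this example.
all_above_5 = all(e > 5 for row in expected for e in row)

for label, row in zip(["CHD", "No CHD"], rounded):
    print(label, row)
print("All expected counts larger than 5:", all_above_5)
```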
Looking at these counts, we see that the high-
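$$
\begin{aligned}
\chi^2 &= \frac{(53-69.73)^2}{69.73} + \frac{(110-106.08)^2}{106.08} + \frac{(27-14.19)^2}{14.19} \\
&\quad + \frac{(3057-3040.27)^2}{3040.27} + \frac{(4621-4624.92)^2}{4624.92} + \frac{(606-618.81)^2}{618.81} \\
&= 4.014 + 0.145 + 11.564 + 0.092 + 0.003 + 0.265 \\
&= 16.083
\end{aligned}
$$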
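The same sum can be reproduced in a few lines of Python. This sketch uses the expected counts rounded to two decimals, as printed in the table, so the result may differ from 16.083 in the last digit depending on where rounding happens. It also reports which cell contributes the largest term. The variable names are illustrative.

```python
# Observed counts and expected counts (rounded to two decimals, as in the table).
observed = [[53, 110, 27], [3057, 4621, 606]]
expected = [[69.73, 106.08, 14.19], [3040.27, 4624.92, 618.81]]

# One term per cell: (observed - expected)^2 / expected.
terms = [
    [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]

# The chi-square statistic is the sum of all six terms.
chi_square = sum(sum(row) for row in terms)

# Find the cell with the largest contribution to the total.
row_labels = ["CHD", "No CHD"]
col_labels = ["Low", "Moderate", "High"]
largest = max(
    (terms[i][j], row_labels[i], col_labels[j])
    for i in range(len(row_labels))
    for j in range(len(col_labels))
)

print(f"chi-square = {chi_square:.3f}")
print(f"largest term: {largest[0]:.3f} ({largest[1]}, {largest[2]} anger)")
```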
In practice, statistical software can do all this arithmetic for you. Look at the six terms that we sum to get χ2. Most of the total comes from just one cell: high-
Significance? Look at the df = 2 line of Table 24.1. The observed chi-
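As an alternative to reading a table, the P-value for this particular case can be computed directly. The sketch below relies on two standard facts that are assumptions relative to this text: the degrees of freedom for a two-way table are (rows − 1)(columns − 1), and for df = 2 only, the probability that a chi-square variable exceeds x is exp(−x/2). For other degrees of freedom a statistics library would be needed.

```python
import math

# Degrees of freedom for a two-way table: (rows - 1) * (columns - 1).
rows, cols = 2, 3
df = (rows - 1) * (cols - 1)

# Observed chi-square statistic from the anger and CHD example.
chi_square = 16.083

# Closed-form upper-tail probability; valid ONLY when df == 2.
if df != 2:
    raise ValueError("exp(-x/2) gives the chi-square tail area only for df = 2")
p_value = math.exp(-chi_square / 2)

print(f"df = {df}, P-value = {p_value:.4f}")
```

The result agrees with the unadjusted P-value of about 0.0003 quoted in the conclusion of this example.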
The conclusion. Can we conclude that proneness to anger causes heart disease? This is an observational study, not an experiment. It isn’t surprising to find that some lurking variables are confounded with anger. For example, people prone to anger are more likely than others to drink and smoke. The study report used advanced statistics to adjust for many differences among the three anger groups. The adjustments raised the P-value from P = 0.0003 to P = 0.02 because the lurking variables explain some of the heart disease. This is still good evidence for a relationship if a significance level of 0.05 is used. Because the study started with a random sample of people who had no CHD and followed them forward in time, and because many lurking variables were measured and accounted for, it does give some evidence for causation. The next step might be an experiment that shows anger-