Using the chi-square test

Like our test for a population proportion, the chi-square test uses some approximations that become more accurate as we take more observations. Here is a rough rule for when it is safe to use this test.

Cell counts required for the chi-square test

You can safely use the chi-square test when no more than 20% of the expected counts are less than 5 and all individual expected counts are 1 or greater.
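As a rough illustration, the rule can be written as a small Python check. This is a minimal sketch: the function name `chi_square_counts_ok` and the example lists are assumptions made for illustration, not part of the text.

```python
def chi_square_counts_ok(expected_counts):
    """Check the rough rule: no more than 20% of the expected counts
    are below 5, and every expected count is at least 1."""
    counts = list(expected_counts)
    if not counts:
        raise ValueError("need at least one expected count")
    below_5 = sum(1 for c in counts if c < 5)
    # below_5 / len(counts) <= 0.20, written with integers to avoid rounding
    return below_5 * 5 <= len(counts) and min(counts) >= 1


# Expected counts that are all 8 or 16 (the cell layout here is illustrative).
print(chi_square_counts_ok([8, 8, 8, 16, 16, 16]))
# Hypothetical table in which 2 of 6 expected counts fall below 5.
print(chi_square_counts_ok([3.2, 4.1, 12.0, 20.5, 9.7, 30.0]))
```

The first call passes the rule and the second fails it, because a third of its expected counts are below 5.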

The cocaine study easily passes this test: all the expected cell counts are either 8 or 16. Here is a concluding example that outlines the examination of a two-way table.

EXAMPLE 6 Do angry people have a greater incidence of heart disease?

People who get angry easily tend to have more heart disease. That’s the conclusion of a study that followed a random sample of 12,986 people from three locations for about four years. All subjects were free of heart disease at the beginning of the study. The subjects took the Spielberger Trait Anger Scale test, which measures how prone a person is to sudden anger. Here are data for the 8474 people in the sample who had normal blood pressure. CHD stands for “coronary heart disease.” This includes people who had heart attacks and those who needed medical treatment for heart disease.

                      Anger score
                 Low    Moderate    High
Sample size     3110        4731     633
CHD count         53         110      27
CHD percent     1.7%        2.3%    4.3%
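The CHD percents are each group's CHD count divided by its sample size. A minimal Python sketch of that arithmetic follows; the variable names are chosen here for illustration.

```python
# Counts for the 8474 subjects with normal blood pressure.
sample_size = {"Low": 3110, "Moderate": 4731, "High": 633}
chd_count = {"Low": 53, "Moderate": 110, "High": 27}

# Percent of each anger group that developed CHD.
chd_percent = {g: 100 * chd_count[g] / sample_size[g] for g in sample_size}

for group, pct in chd_percent.items():
    print(f"{group:<9}{pct:.1f}%")
```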

There is a clear trend: as the anger score increases, so does the percentage who suffer heart disease. Is this relationship between anger and heart disease statistically significant?

The first step is to write the data as a two-way table by adding the counts of subjects who did not suffer from heart disease. We also add the row and column totals, which we need to find the expected counts.

          Low anger    Moderate anger    High anger    Total
CHD              53               110            27      190
No CHD         3057              4621           606     8284
Total          3110              4731           633     8474

We can now follow the steps for a significance test, familiar from Chapter 22.

The hypotheses. The chi-square method tests these hypotheses:

H0: no association between anger and CHD

Ha: some association between anger and CHD

The sampling distribution. We will see that all the expected cell counts are larger than 5, so we can safely apply the chi-square test. The two-way table of anger versus CHD has two rows and three columns. We will use critical values from the chi-square distribution with degrees of freedom df = (2 − 1)(3 − 1) = 2.

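The data. First find the expected cell counts. For example, the expected count of high-anger people with CHD is

$$\text{expected count} = \frac{\text{row total} \times \text{column total}}{\text{table total}} = \frac{(190)(633)}{8474} = 14.19$$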

Here is the complete table of observed and expected counts side by side:

                  Observed                        Expected
           Low    Moderate    High         Low    Moderate      High
CHD         53         110      27       69.73      106.08     14.19
No CHD    3057        4621     606     3040.27     4624.92    618.81
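The expected counts can be reproduced with a few lines of Python. This is a sketch using only the observed counts from the two-way table; the variable names are illustrative.

```python
# Observed counts: rows are CHD / No CHD, columns are Low / Moderate / High anger.
observed = [
    [53, 110, 27],
    [3057, 4621, 606],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
table_total = sum(row_totals)

# expected count = row total * column total / table total
expected = [[r * c / table_total for c in col_totals] for r in row_totals]

for row in expected:
    print([round(x, 2) for x in row])
```

Each row of expected counts sums to the same row total as the observed counts, which is a quick check on the arithmetic.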

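Looking at these counts, we see that the high-anger group has more CHD than expected and the low-anger group has less CHD than expected. This is consistent with what the percentages in Example 6 show. The chi-square statistic is

$$
\begin{aligned}
\chi^2 &= \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}} \\
&= \frac{(53 - 69.73)^2}{69.73} + \frac{(110 - 106.08)^2}{106.08} + \frac{(27 - 14.19)^2}{14.19} \\
&\quad + \frac{(3057 - 3040.27)^2}{3040.27} + \frac{(4621 - 4624.92)^2}{4624.92} + \frac{(606 - 618.81)^2}{618.81} \\
&= 4.014 + 0.145 + 11.564 + 0.092 + 0.003 + 0.265 = 16.083
\end{aligned}
$$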

In practice, statistical software can do all this arithmetic for you. Look at the six terms that we sum to get $\chi^2$. Most of the total comes from just one cell: high-anger people have more CHD than expected.
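To see where the total comes from, the six terms can be computed directly. The sketch below is plain Python, not any particular statistics package. It uses unrounded expected counts, so individual terms may differ in the third decimal place from a hand calculation based on rounded values.

```python
observed = {
    "CHD": {"Low": 53, "Moderate": 110, "High": 27},
    "No CHD": {"Low": 3057, "Moderate": 4621, "High": 606},
}
groups = ["Low", "Moderate", "High"]

row_totals = {row: sum(cells.values()) for row, cells in observed.items()}
col_totals = {g: sum(observed[row][g] for row in observed) for g in groups}
table_total = sum(row_totals.values())

# One term per cell: (observed - expected)^2 / expected
terms = {}
for row in observed:
    for g in groups:
        expected = row_totals[row] * col_totals[g] / table_total
        terms[(row, g)] = (observed[row][g] - expected) ** 2 / expected

chi_square = sum(terms.values())
largest_cell = max(terms, key=terms.get)

for cell, term in terms.items():
    print(cell, round(term, 3))
print("chi-square:", round(chi_square, 2), "largest term:", largest_cell)
```

The largest term belongs to the high-anger CHD cell, which accounts for roughly 70% of the statistic.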

Significance? Look at the df = 2 line of Table 24.1. The observed chi-square, about 16.08, is larger than the critical value 13.82 for significance level 0.001. We have highly significant evidence ($P < 0.001$) that anger and heart disease are related. Statistical software can give the actual $P$-value. It is $P = 0.0003$.
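For df = 2 the upper-tail area of the chi-square distribution has the simple closed form $e^{-x/2}$, so the $P$-value can be checked without statistical software. The sketch below relies on that special case and does not apply to other degrees of freedom. The function name is an assumption for illustration, and the statistic 16.083 is the value obtained from the expected counts rounded to two decimals.

```python
import math


def chi_square_p_value_df2(chi_square):
    """Upper-tail probability for a chi-square distribution with df = 2.

    Valid only for df = 2, where the tail area is exactly exp(-x / 2).
    """
    return math.exp(-chi_square / 2)


chi_square = 16.083  # statistic from the rounded expected counts
p_value = chi_square_p_value_df2(chi_square)
critical_tail = chi_square_p_value_df2(13.82)  # tail area beyond the critical value

print(f"P = {p_value:.4f}")
print(f"tail area beyond 13.82 = {critical_tail:.4f}")
```

The tail area beyond 13.82 comes out very close to 0.001, which is why 13.82 serves as the critical value at that significance level.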

The conclusion. Can we conclude that proneness to anger causes heart disease? This is an observational study, not an experiment. It isn’t surprising to find that some lurking variables are confounded with anger. For example, people prone to anger are more likely than others to drink and smoke. The study report used advanced statistics to adjust for many differences among the three anger groups. The adjustments raised the $P$-value from $P = 0.0003$ to a larger value because the lurking variables explain some of the heart disease. The adjusted result is still good evidence for a relationship if a significance level of 0.05 is used. Because the study started with a random sample of people who had no CHD and followed them forward in time, and because many lurking variables were measured and accounted for, it does give some evidence for causation. The next step might be an experiment that shows anger-prone people how to change. Will this reduce their risk of heart disease?
