Chapter 1. Chi-square Goodness of Fit Test

Statistical Applets

Set the Bag Count n and the significance level (α) for your hypothesis test with the sliders, then click POUR NEW BAG to pour a bag of candies from the hopper (i.e. take a sample from the underlying population). The table on the right will show the number of candies for each color in the bag, along with the results of the chi-square goodness-of-fit test against the hypothesis that all colors are equally represented in the hopper. Click SHOW HOPPER VALUES to reveal the true population proportions, and click NEW HOPPER to start over with a new hopper (that has a new set of population proportions).

Suppose you have a large "hopper" of colored candies. You believe each of the 5 colors is represented equally often in the hopper (that is, you think the hopper contains 20% of each color), but you're not sure. To test this hypothesis, you can pour a "bag" of candies, count the number of each color in the bag, and perform a chi-square goodness-of-fit test on the resulting frequencies. The goodness-of-fit test can be used to test any null hypothesis about the underlying population proportions; in this case the null hypothesis is that all 5 colors are equally likely to come out of the hopper.
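As a rough illustration of the test described above, the following Python sketch computes the chi-square statistic and P-value for one bag. The function names and the bag counts are hypothetical and are not taken from the applet. The P-value helper uses the closed-form upper-tail probability that holds only when the degrees of freedom are even, which is the case here (5 colors give 4 degrees of freedom).

```python
import math

def chi_square_gof(observed, expected_props=None):
    """Return (statistic, degrees of freedom) for a goodness-of-fit test.

    If expected_props is None, the null hypothesis is that every
    category is equally likely.
    """
    n = sum(observed)
    k = len(observed)
    if expected_props is None:
        expected_props = [1.0 / k] * k
    expected = [n * p for p in expected_props]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, k - 1

def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only handles positive, even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total

# Hypothetical bag of 20 candies: counts for each of the 5 colors.
bag = [6, 3, 5, 2, 4]
stat, df = chi_square_gof(bag)
p_value = chi2_sf_even_df(stat, df)
alpha = 0.05  # significance level, as set by the slider in the applet
print(f"chi-square = {stat:.2f}, df = {df}, P-value = {p_value:.3f}")
print("reject null" if p_value < alpha else "do not reject null")
```

With bags of 20 candies the expected count per color is only 4, so the chi-square approximation is fairly rough at that size.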

Try pouring a few bags from the hopper (controlling the size of each bag using the Bag Count slider) and observe the frequencies and the chi-square test for each bag. A rejected null hypothesis indicates evidence that the color proportions are not equal. After testing a few bags, click the SHOW HOPPER VALUES button to reveal the true population proportions for the hopper. You can then click to get a new hopper (with a new set of population proportions) and try again.

1.

Click NEW HOPPER to generate a fresh hopper of candies. Then click SHOW HOPPER VALUES to reveal the true population proportions of each candy color. In this hopper, what percentage of the candies are red?


2.

Are the population proportions in this hopper truly equal? That is, if you draw a candy at random from the hopper, are each of the colors equally likely to come out?

Correct. The proportions of colors in this hopper are not exactly equal to each other.

3.

Draw 5 bags of 20 candies each from the hopper, and note for each bag whether or not the null hypothesis (that all colors are equally likely to be drawn) is rejected. For how many of these bags is the null hypothesis rejected?

To answer this question you must draw 5 bags of 20 candies each from the hopper, noting for each bag whether or not the null hypothesis is rejected.

4.

Now draw 5 bags of 200 candies each from the hopper. For how many of these larger bags is the null hypothesis rejected?

To answer this question you must draw 5 bags of 200 candies each from the hopper, noting for each bag whether or not the null hypothesis is rejected.

5.

You almost certainly found that the null hypothesis was more likely to be rejected with bags of 200 candies than with bags of 20 candies. Does this result indicate that the chi-square test is flawed in some way? Or does it simply reveal a truth about hypothesis testing? Explain your answer.

There's nothing wrong with the chi-square test. The likelihood that a null hypothesis will be rejected by any statistical test depends on a number of factors, one of which is the sample size. The larger the sample, the more closely the sample's distribution tends to resemble the population distribution. This has two practical effects on an inferential statistic like chi-square. First, if the null hypothesis is in fact false, the sample is less likely to match the null hypothesis by chance through random sampling error. Second, the sampling variability of the observed proportions shrinks as the sample grows, so a real difference from the hypothesized proportions produces a larger test statistic, and therefore a smaller P-value and a higher probability that the null hypothesis will be rejected.
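The effect of sample size described in the answer above can be checked with a small simulation. The sketch below is only an illustration: the hopper proportions, the function names, the number of trials, and the random seed are all assumptions, not values from the applet. It pours many bags of each size from a hopper whose colors are not equally likely and records how often the null hypothesis of equal proportions is rejected. The P-value formula is the closed form for 4 degrees of freedom, so the code assumes exactly 5 colors.

```python
import math
import random

def p_value_df4(stat):
    """Upper-tail chi-square probability for exactly 4 degrees of freedom."""
    return math.exp(-stat / 2.0) * (1.0 + stat / 2.0)

def pour_bag(rng, props, n):
    """Draw n candies at random and return the count of each color."""
    draws = rng.choices(range(len(props)), weights=props, k=n)
    counts = [0] * len(props)
    for color in draws:
        counts[color] += 1
    return counts

def rejection_rate(props, n, alpha=0.05, trials=2000, seed=1):
    """Fraction of simulated bags for which equal proportions is rejected."""
    if len(props) != 5:
        raise ValueError("this sketch assumes 5 colors (4 degrees of freedom)")
    rng = random.Random(seed)
    expected = n / len(props)  # expected count per color under the null
    rejections = 0
    for _ in range(trials):
        counts = pour_bag(rng, props, n)
        stat = sum((o - expected) ** 2 / expected for o in counts)
        if p_value_df4(stat) < alpha:
            rejections += 1
    return rejections / trials

# Hypothetical hopper in which the colors are NOT equally likely.
hopper = [0.30, 0.25, 0.20, 0.15, 0.10]
rate_small = rejection_rate(hopper, 20)
rate_large = rejection_rate(hopper, 200)
print(f"bags of 20:  rejected in {rate_small:.1%} of trials")
print(f"bags of 200: rejected in {rate_large:.1%} of trials")
```

Passing an equal-proportion hopper such as [0.2, 0.2, 0.2, 0.2, 0.2] instead should give a rejection rate close to the chosen significance level for larger bags, since every rejection is then a false alarm.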