OBJECTIVES By the end of this section, I will be able to …
According to the Adobe Digital Index, the market share for the leading Internet browsers (both desktop and mobile) in June 2014 was as follows: Google Chrome, 32%; Microsoft Internet Explorer, 31%; others, 37%. Change is rapid in the online environment. Have these market shares changed since June 2014? How would we go about performing a hypothesis test to determine whether market shares have changed significantly? In Section 11.1, we examine this question using a new type of hypothesis test called a goodness of fit test. We begin by first considering a new type of random variable that is used to represent categorical data.
1 The Multinomial Random Variable
Recall from Chapter 1 that categorical (qualitative) variables take values that can be classified into categories. In Chapter 6, we considered binomial random variables, for which there are only two possible outcomes. Now, let's consider the following type of random variable, which can have more than two possible values.
Multinomial Random Variable
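A random variable is multinomial if it satisfies each of the following conditions:
- Each independent trial of the experiment has $k$ possible outcomes, where $k = 2, 3, \ldots$
- The $i$th outcome (category) occurs with probability $p_i$, where $i = 1, 2, \ldots, k$.
- The probabilities sum to 1: $\sum_{i=1}^{k} p_i = 1$.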
Data from a multinomial random variable are said to follow a multinomial distribution.
Note: The binomial distribution may be considered a special case of the multinomial distribution, with $k = 2$.
For example, suppose 30% of the residents of a particular town are Democrats, 30% are Republicans, and 40% are Independents. If we select residents at random, then the numbers of Democrats, Republicans, and Independents observed follow a multinomial distribution, with $k = 3$, $p_{\text{Democrat}} = 0.30$, $p_{\text{Republican}} = 0.30$, and $p_{\text{Independent}} = 0.40$.
EXAMPLE 1 Identifying a Multinomial Random Variable
For each of the following, determine whether the random variable is multinomial.
Solution
NOW YOU CAN DO
Exercises 5–8
Next, recall from Section 6.2 that the formula for finding the expected value (mean) of a binomial random variable having $n$ trials and probability of success $p$ is
$$E(X) = n \cdot p$$
For a multinomial random variable, the expected frequency of the $i$th category is
$$E_i = n \cdot p_i$$
where $n$ represents the number of trials, and $p_i$ represents the population proportion for the $i$th category.
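The expected-frequency formula $E_i = n \cdot p_i$ can be sketched in a few lines of Python. The proportions are those of the town example above; the sample size of 500 and the function name `expected_frequencies` are assumptions made only for illustration.

```python
def expected_frequencies(n, proportions):
    """Return the expected frequency E_i = n * p_i for each category."""
    total = sum(proportions.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("category proportions must sum to 1")
    return {category: n * p for category, p in proportions.items()}

# Town example: 30% Democrats, 30% Republicans, 40% Independents.
# The sample size of 500 is hypothetical, chosen only for illustration.
town = {"Democrat": 0.30, "Republican": 0.30, "Independent": 0.40}
expected = expected_frequencies(500, town)
print(expected)
```

Because the proportions sum to 1, the expected frequencies always sum to the number of trials, which gives a quick check on the arithmetic.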
EXAMPLE 2 Finding the expected frequencies
According to the Adobe Digital Index, the market share for the leading Internet browsers (both desktop and mobile) in June 2014 was as shown in Table 1. Let $X$ = the browser of a randomly selected Internet user.
Browser | Relative frequency |
---|---|
Google Chrome | 0.32 |
Microsoft Internet Explorer | 0.31 |
Other | 0.37 |
Solution
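There are $k = 3$ possible outcomes: Google Chrome, Microsoft Internet Explorer, and Other. Assigning probabilities using the relative frequency method, we have the following hypothesized proportions for each browser:
$$p_{\text{Chrome}} = 0.32, \quad p_{\text{IE}} = 0.31, \quad \text{and} \quad p_{\text{Other}} = 0.37$$
These proportions sum to 1. Therefore, $X$ = browser is a valid multinomial random variable.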
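Table 2 Expected frequencies for a sample of $n = 200$ Internet users
Category | Expected frequency $E_i = n \cdot p_i$ |
---|---|
Google Chrome | $E_{\text{Chrome}} = 200(0.32) = 64$ |
Microsoft Internet Explorer (IE) | $E_{\text{IE}} = 200(0.31) = 62$ |
Other | $E_{\text{Other}} = 200(0.37) = 74$ |
As a check on the calculations, we should have $\sum E_i = n$. In this case, $64 + 62 + 74 = 200$.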
NOW YOU CAN DO
Exercises 9–12.
YOUR TURN #1
Publishers Weekly reported that, in 2014, the book format market share was as follows: paperbacks, 41%; hard covers, 34%; e-books, 13%; and all other formats, 12%. Suppose a survey was conducted this year of 2000 books purchased.
(The solutions are shown in Appendix A.)
What Do These Expected Frequencies Mean?
Recall that the expected value of a random variable refers to the long-run mean of that random variable after an arbitrarily large number of trials. For example, if we repeatedly took samples of 200 Internet users and asked about browser preference, the mean number of persons who used Google Chrome would approach $E_{\text{Chrome}} = 64$ as we took more and more samples, if the proportions given in Table 1 are correct. Similarly, because 31% of the entire population of Internet users use Microsoft IE, we would expect about 31% of any given sample of 200 Internet users to use Microsoft IE, because the sample is a subset of the population. This of course raises the question: Are the proportions in Table 1 still true? That is the type of question we will learn how to address next.
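The long-run interpretation can be illustrated with a small simulation. This is only a sketch: it assumes the Table 1 proportions are correct, and the number of repeated samples (2000) and the random seed are arbitrary choices, not values from the text.

```python
import random

random.seed(2014)  # fixed seed so the run is repeatable

browsers = ["Chrome", "IE", "Other"]
shares = [0.32, 0.31, 0.37]    # Table 1 proportions
n = 200                        # Internet users per sample
num_samples = 2000             # hypothetical number of repeated samples

chrome_counts = []
for _ in range(num_samples):
    # Draw one sample of n users according to the Table 1 proportions.
    sample = random.choices(browsers, weights=shares, k=n)
    chrome_counts.append(sample.count("Chrome"))

mean_chrome = sum(chrome_counts) / num_samples
print(f"mean Chrome count over {num_samples} samples: {mean_chrome:.2f}")
print(f"expected frequency n * p = {n * shares[0]:.0f}")
```

Any single sample may contain noticeably more or fewer than 64 Chrome users; it is the mean over many samples that settles near the expected frequency.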
2 What Is a Goodness of Fit Test?
Do the 2014 market shares still hold true today? In other words, has the distribution of the multinomial random variable browser given in Table 1 changed since June 2014? To determine this, we introduce a new type of hypothesis test, called a goodness of fit test.
Goodness of Fit Test
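A goodness of fit test is a hypothesis test used to determine whether a random variable follows a particular distribution. In a goodness of fit test, the hypotheses are
$H_0$: The random variable follows a particular distribution.
$H_a$: The random variable does not follow the distribution specified in $H_0$.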
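For Example 2, the null hypothesis completely specifies each of the probabilities in the relative frequency distribution, as follows:
$$H_0: p_{\text{Chrome}} = 0.32, \quad p_{\text{IE}} = 0.31, \quad p_{\text{Other}} = 0.37$$
The alternative hypothesis simply denies the claim made by the null hypothesis:
$H_a$: The random variable does not follow the distribution specified in $H_0$.
In other words, $H_a$ claims that the browser market shares have changed since June 2014.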
Developing Your Statistical Sense
Fitting the Model to the Data
Now, a goodness of fit test sounds like something you do in a clothing store dressing room. Actually, the analogy to clothes is rather appropriate. Suppose winter is coming and you are in the market for a new pair of gloves. You find one pair that is especially attractive, but the gloves don't fit your hands. What do you do? You reject the ill-fitting gloves and search for a new pair. In statistics, the gloves represent the models and your hands represent the actual “hard data” observed in the sample.
The null hypothesis represents what is called a model, a working theory of how the population proportions are distributed. Our working model of how the market shares are distributed is stated in the null hypothesis:
Model 1. $H_0: p_{\text{Chrome}} = 0.32, \quad p_{\text{IE}} = 0.31, \quad p_{\text{Other}} = 0.37$
Of course, we could also try other models if we think the market has changed, such as the following:
Model 2.
Model 3.
In hypothesis testing, we “try on” only one model at a time.
In statistics, a goodness of fit test determines if the actual “hard data” observed in the sample are consistent with the proportions stated in the null hypothesis. Market researchers would collect data on the actual preferences of a sample of 200 real Internet users in order to determine whether or not the market shares have changed. The sample is summarized in a set of observed frequencies of Internet users who prefer the various browsers. The goodness of fit test then compares these observed frequencies with the expected frequencies found in Example 2.
How a Goodness of Fit Test Works
The goodness of fit test is based on a comparison of the observed frequencies (sample data) with the expected frequencies when $H_0$ is true. That is, we compare what we actually see with what we would expect to see if $H_0$ were true. If the difference between the observed and expected frequencies is large, we reject $H_0$.
The difference between the observed and expected frequencies is measured by the test statistic, $\chi^2_{\text{data}}$. As usual, it comes down to how large a difference is large.
Test Statistic for the Goodness of Fit Test
For a multinomial random variable with $k$ categories and $n$ trials, let $O_i$ represent the observed frequency for category $i$, and let $E_i$ represent the expected frequency for category $i$. Then the test statistic for a goodness of fit test
$$\chi^2_{\text{data}} = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$
approximately follows a $\chi^2$ (chi-square) distribution with $k - 1$ degrees of freedom (df), if the following conditions are satisfied:
- None of the expected frequencies is less than 1.
- At most, 20% of the expected frequencies are less than 5.
Students may want to review the characteristics of the $\chi^2$ distribution (Chapter 10, page 618) and the procedure for finding critical values for a right-tailed test (Chapter 10, page 620).
If the conditions are not satisfied, then it may be possible to combine two or more categories so that the conditions may then be fulfilled.
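The two conditions on the expected frequencies (none less than 1, and at most 20% less than 5) are easy to check mechanically. The sketch below uses a hypothetical helper name, `conditions_met`; the second and third lists are made-up expected frequencies, included only to show a failing case for each condition.

```python
def conditions_met(expected):
    """Check the two conditions for the chi-square approximation.

    (a) no expected frequency is below 1;
    (b) at most 20% of the expected frequencies are below 5.
    """
    none_below_1 = all(e >= 1 for e in expected)
    share_below_5 = sum(e < 5 for e in expected) / len(expected)
    return none_below_1 and share_below_5 <= 0.20

print(conditions_met([64, 62, 74]))        # browser example, n = 200
print(conditions_met([0.5, 10, 20, 30]))   # hypothetical: one E_i below 1
print(conditions_met([3, 4, 10, 20, 30]))  # hypothetical: 40% of E_i below 5
```

When the check fails, combining sparse categories and recomputing the expected frequencies is the remedy the text describes.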
EXAMPLE 3 Calculating $\chi^2_{\text{data}}$
Suppose the observed frequencies of browser preference in Table 3 come from a survey taken this year of 200 Internet users.
Browser | Observed frequency |
---|---|
Google Chrome | 80 |
Microsoft Internet Explorer | 62 |
Other | 58 |
Calculate the test statistic $\chi^2_{\text{data}}$ by comparing the observed frequencies from Table 3 with the expected frequencies calculated in Table 2 of Example 2.
Solution
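The observed frequencies are found in Table 3, and the expected frequencies are given in Table 2. Table 4 then provides the quantities needed to calculate $\chi^2_{\text{data}}$. Then
$$\chi^2_{\text{data}} = \sum \frac{(O_i - E_i)^2}{E_i} = 4 + 0 + 3.4595 = 7.4595$$
Table 4 Quantities needed to calculate $\chi^2_{\text{data}}$
Category | $p_i$ | $O_i$ | $E_i$ | $O_i - E_i$ | $(O_i - E_i)^2$ | $(O_i - E_i)^2 / E_i$ |
---|---|---|---|---|---|---|
Chrome | 0.32 | 80 | 64 | 16 | 256 | $256/64 = 4$ |
IE | 0.31 | 62 | 62 | 0 | 0 | $0/62 = 0$ |
Other | 0.37 | 58 | 74 | −16 | 256 | $256/74 \approx 3.4595$ |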
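The calculation in this example can be reproduced with a short script. It is a minimal sketch: the function name `chi_square_statistic` is our own, and the data are the Table 1 proportions and the Table 3 observed frequencies.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O_i - E_i)^2 / E_i over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

n = 200
proportions = [0.32, 0.31, 0.37]   # Chrome, IE, Other (Table 1)
observed = [80, 62, 58]            # Table 3
expected = [n * p for p in proportions]

chi2_data = chi_square_statistic(observed, expected)
print(round(chi2_data, 4))
```

The result is about 7.46. Note that every term in the sum is nonnegative, because each difference $O_i - E_i$ is squared before it is divided by $E_i$.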
NOW YOU CAN DO
Exercises 13–18.
YOUR TURN #2
Publishers Weekly reported that, in 2014, the book format market share was as follows: paperbacks, 41%; hard covers, 34%; e-books, 13%; and all other formats, 12%. Suppose a survey was conducted this year of 2000 books purchased, with the following book sales: 810 paperbacks, 680 hard covers, 280 e-books, and 230 others. Calculate the test statistic $\chi^2_{\text{data}}$.
(The solution is shown in Appendix A.)
3 Performing the Goodness of Fit Test
The goodness of fit test may be performed using (a) the critical-value method or (b) the p-value method. We start with the critical-value method.
Goodness of Fit Test: Critical-Value Method
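The following conditions must be met:
- None of the expected frequencies is less than 1.
- At most, 20% of the expected frequencies are less than 5.
The expected frequency for the $i$th category is $E_i = n \cdot p_i$, where $n$ represents the number of trials and $p_i$ represents the population proportion for the $i$th category.
Step 3 Calculate $\chi^2_{\text{data}}$.
$$\chi^2_{\text{data}} = \sum \frac{(O_i - E_i)^2}{E_i}$$
where $O_i$ = observed frequency, and $E_i$ = expected frequency.
All hypothesis tests in this chapter are right-tailed tests, so we need only the critical value $\chi^2_{\text{crit}}$ that has area $\alpha$ to its right.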
EXAMPLE 4 Critical-value method for the goodness of fit test
Test whether the Internet browser market shares from Example 2 have changed since June 2014, using level of significance $\alpha$.
Solution
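Step 1 State the hypotheses and check the conditions. The hypotheses are:
$$H_0: p_{\text{Chrome}} = 0.32, \quad p_{\text{IE}} = 0.31, \quad p_{\text{Other}} = 0.37$$
$H_a$: The random variable does not follow the distribution specified in $H_0$.
Checking the conditions, the expected frequencies from Table 2 are
$$E_{\text{Chrome}} = 64, \quad E_{\text{IE}} = 62, \quad E_{\text{Other}} = 74$$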
Because none of these expected frequencies is less than 1, and none of the expected frequencies is less than 5, the conditions for performing the goodness of fit test are satisfied.
Step 2 Find the critical value, $\chi^2_{\text{crit}}$, and state the rejection rule. We have degrees of freedom $\text{df} = k - 1 = 3 - 1 = 2$. Turning to the $\chi^2$ table (Table E in the Appendix) in the column labeled $\chi^2_{\alpha}$ and the row containing $\text{df} = 2$, we find $\chi^2_{\text{crit}}$, as shown in Figure 1. The rejection rule is “Reject $H_0$ if $\chi^2_{\text{data}} \geq \chi^2_{\text{crit}}$.”
Evidence exists at level of significance $\alpha$ that the random variable browser does not follow the distribution specified in $H_0$. In other words, evidence exists that the market shares for Internet browsers have changed.
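The critical-value decision for this example can be sketched in code. Two assumptions are made here and are not taken from the text: the level of significance is set to 0.05 purely for illustration, and the critical value is computed from a closed form that holds only for df = 2 rather than read from Table E. The helper name `critical_value_df2` is hypothetical.

```python
import math

def critical_value_df2(alpha):
    """Chi-square critical value with right-tail area alpha, for df = 2 only.

    For df = 2 the area to the right of x is exp(-x / 2), so solving
    exp(-x / 2) = alpha gives x = -2 * ln(alpha).
    """
    return -2.0 * math.log(alpha)

alpha = 0.05                        # ASSUMED level of significance
chi2_data = 4.0 + 0.0 + 256 / 74    # test statistic from Example 3
chi2_crit = critical_value_df2(alpha)

print(f"critical value: {chi2_crit:.3f}")
print("reject H0" if chi2_data >= chi2_crit else "do not reject H0")
```

Under this assumed level of significance the test statistic exceeds the critical value, which agrees with the conclusion stated above. For other degrees of freedom the closed form does not apply, and a table or statistical software is needed.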
NOW YOU CAN DO
Exercises 19–22.
YOUR TURN #3
Test, using level of significance $\alpha$, whether the book format market shares have changed, using the information from Your Turn #1 on page 634 and Your Turn #2 on page 637.
(The solution is shown in Appendix A.)
Developing Your Statistical Sense
Be Careful How You Interpret the Conclusion
Note carefully what this conclusion says and what it doesn't say. The goodness of fit test provides evidence that the random variable does not follow the distribution specified in $H_0$. In particular, the conclusion does not state, for example, that Chrome's proportion is significantly greater than it was in 2014. Informally, we can compare the observed frequency of 80 with the expected frequency of 64 for the Chrome browser and note that there appears to be evidence of an increase in market share for Chrome. But this is only informal and is not part of the hypothesis test. It is a common error in statistical analysis to form conclusions beyond what the hypothesis test is actually testing.
Next, we turn to the p-value method. The goodness of fit test is a right-tailed test, so the p-value for the $\chi^2$ statistic is defined as the area under the $\chi^2$ curve to the right of the test statistic $\chi^2_{\text{data}}$, as shown in Figure 3. That is,
$$p\text{-value} = P(\chi^2 > \chi^2_{\text{data}})$$
Goodness of Fit Test: p-Value Method
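The following conditions must be met:
- None of the expected frequencies is less than 1.
- At most, 20% of the expected frequencies are less than 5.
The expected frequency for the $i$th category is $E_i = n \cdot p_i$, where $n$ represents the number of trials and $p_i$ represents the population proportion for the $i$th category.
Step 2 Calculate $\chi^2_{\text{data}}$.
$$\chi^2_{\text{data}} = \sum \frac{(O_i - E_i)^2}{E_i}$$
where $O_i$ = observed frequency, and $E_i$ = expected frequency.
Step 3 Find the p-value.
$$p\text{-value} = P(\chi^2 > \chi^2_{\text{data}})$$
(see Figure 3)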
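The right-tail area that defines the p-value can be computed without a table. The following sketch uses only Python's standard library and a known recurrence for the chi-square tail area with whole-number degrees of freedom; the function name `chi_square_p_value` is our own, and in practice a calculator or statistical package would normally be used instead.

```python
import math

def chi_square_p_value(chi2_data, df):
    """Area to the right of chi2_data under the chi-square curve.

    Valid for whole-number df >= 1 and chi2_data >= 0.
    """
    y = chi2_data / 2.0
    if df % 2 == 0:
        a, tail = 1.0, math.exp(-y)              # exact tail area for df = 2
    else:
        a, tail = 0.5, math.erfc(math.sqrt(y))   # exact tail area for df = 1
    # Each pass adds the term that raises the degrees of freedom by 2.
    while a < df / 2.0:
        tail += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return tail

# Test statistic from Example 3 (browser data), with df = 3 - 1 = 2.
p_value = chi_square_p_value(4.0 + 256 / 74, df=2)
print(f"{p_value:.4f}")
```

A small p-value means that a test statistic at least this large would rarely occur if the hypothesized proportions were correct.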
EXAMPLE 5 p-Value method for the goodness of fit test using technology
Table 5 contains the distribution of violent crime in New York City in 2012.² Suppose that a random sample of 1000 violent crimes in New York City yielded the counts shown in Table 6. Test whether the population proportions have changed since 2012, using the p-value method and level of significance $\alpha$.
Table 5 Distribution of violent crime in New York City, 2012
Murder | Rape | Robbery | Assault |
---|---|---|---|
0.01 | 0.04 | 0.35 | 0.60 |

Table 6 Observed counts in a random sample of 1000 violent crimes
Murder | Rape | Robbery | Assault |
---|---|---|---|
6 | 50 | 350 | 594 |
Solution
Reject $H_0$ if the p-value $\leq \alpha$.
What Results Might We Expect?
Before we do the formal hypothesis test, let's try to figure out what the conclusion might be. Figure 4 is a clustered bar graph (see Section 2.1) of the observed and expected frequencies for each of the four categories. If $H_0$ were true, then, for each category, we would expect the red bars (observed frequencies) and blue bars (expected frequencies) to have somewhat similar heights. In fact, the heights of the bars are fairly similar for all four categories, indicating not much difference between the crimes that were observed and the crimes that were expected. Thus, we might expect to not reject $H_0$.
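First, we need to find the expected frequencies. We have $n = 1000$, so the expected frequencies are as shown here.
Category | Expected frequency $E_i = n \cdot p_i$ |
---|---|
Murder | $1000(0.01) = 10$ |
Rape | $1000(0.04) = 40$ |
Robbery | $1000(0.35) = 350$ |
Assault | $1000(0.60) = 600$ |
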
Next, check the conditions for this test. Because (a) none of the expected frequencies is less than 1 and (b) no more than 20% of the expected frequencies are less than 5, we may proceed. We use the instructions provided in the Step-by-Step Technology Guide at the end of this section.
Step 2 Find the test statistic $\chi^2_{\text{data}}$. The TI-83/84 results in Figure 5 tell us that
$$\chi^2_{\text{data}} = 4.16$$
Step 3 Find the p-value. Figure 5 also tells us that
$$p\text{-value} = 0.2447$$
This p-value, for the $\chi^2$ distribution with 3 degrees of freedom, is shown in Figure 6.
Figure 7a shows the TI-84 output for the test, and Figure 7b shows the SPSS output for the test, confirming our test statistic of 4.16 and p-value of 0.2447.
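As a cross-check on the calculator and SPSS output, the same numbers can be recomputed by hand in code. This is only a sketch and does not reproduce the TI-84 or SPSS procedures: it uses the Table 5 proportions and Table 6 counts, and the p-value line relies on a closed form for the right-tail area that is valid for 3 degrees of freedom only.

```python
import math

proportions = {"Murder": 0.01, "Rape": 0.04, "Robbery": 0.35, "Assault": 0.60}  # Table 5
observed = {"Murder": 6, "Rape": 50, "Robbery": 350, "Assault": 594}            # Table 6
n = sum(observed.values())   # 1000 crimes in the sample

chi2_data = 0.0
for category, p in proportions.items():
    e = n * p                                         # expected frequency
    contribution = (observed[category] - e) ** 2 / e  # (O - E)^2 / E
    chi2_data += contribution
    print(f"{category:8s} E = {e:6.1f}  contribution = {contribution:.2f}")

# Right-tail area for df = 4 - 1 = 3 (this closed form holds for df = 3 only).
half = chi2_data / 2
p_value = math.erfc(math.sqrt(half)) + math.sqrt(2 * chi2_data / math.pi) * math.exp(-half)
print(f"chi-square = {chi2_data:.2f}, p-value = {p_value:.4f}")
```

Printing the contribution of each category also shows where the statistic comes from: in these data, the Rape and Murder categories account for nearly all of it, while Robbery contributes nothing because its observed and expected frequencies are equal.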
NOW YOU CAN DO
Exercises 23–26.