OBJECTIVE By the end of this section, I will be able to …
In Section 14.3, we compared data from dependent samples. Here, in Section 14.4, we analyze data from independent samples. Recall from Section 10.1 that two samples are independent when the subjects selected for the first sample do not determine the subjects in the second sample. In Section 10.2, we learned how to perform a hypothesis test for the difference in population means using two independent samples. The two-sample test that we learned in that section required either that each sample size be large (at least 30) or that each population be normally distributed. Here, in Section 14.4, we will learn about the Wilcoxon rank sum test for the difference in population medians using two independent samples, which has less stringent conditions.
1 Wilcoxon Rank Sum Test for the Difference in Population Medians Using Two Independent Samples
The Wilcoxon rank sum test is equivalent to the Mann-Whitney test, another nonparametric test used in some textbooks to test for the difference in population medians. (By an “equivalent hypothesis test,” we mean a hypothesis test that is applicable to the same situations and always provides the same conclusions.)
The requirements for the Wilcoxon rank sum test are less strict than those for the two-sample test of Section 10.2, as we shall see.
The Wilcoxon rank sum test is a nonparametric hypothesis test in which the original data from two independent samples are transformed into their ranks. It tests whether the two population medians are equal or not.
In the Wilcoxon rank sum test, the two samples are temporarily combined, and the ranks of the combined data values are calculated. Then the ranks are summed separately for each sample.
EXAMPLE 13 Finding the ranks of combined data and summing the ranks for each sample
The following table shows the pulse rates in beats per minute for a random sample of five women and a random sample of four men.
Women | 66 | 77 | 57 | 62 | 68 |
Men | 79 | 71 | 68 | 71 |
Solution
We temporarily combine the two samples and arrange the values in increasing order. We then rank the data values from smallest to largest, as shown in the following table. Note that we have two pulse rates of 68 beats per minute. Had these not been tied, they would have had ranks 4 and 5. We therefore assign to each the mean rank $(4 + 5)/2 = 4.5$. Similarly, the two pulse rates of 71 beats per minute are assigned the mean rank $(6 + 7)/2 = 6.5$.
Combined data | 57 | 62 | 66 | 68 | 68 | 71 | 71 | 77 | 79 |
Rank | 1 | 2 | 3 | 4.5 | 4.5 | 6.5 | 6.5 | 8 | 9 |
The sum of the ranks for the women is $3 + 8 + 1 + 2 + 4.5 = 18.5$.
The sum of the ranks for the men is $9 + 6.5 + 4.5 + 6.5 = 26.5$.
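The ranking and summing in this example can be sketched in a few lines of Python. This is only an illustration of the procedure described above, not part of the text's method; the function name `midranks` and the variable names are chosen here for convenience.

```python
def midranks(values):
    """Return the rank of each value; tied values share the mean of their ranks."""
    # Indices of the values, arranged so the values are in increasing order.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # Find the end of the run of equal values that begins at `start`.
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # Sorted positions start..end would receive ranks start+1..end+1.
        mean_rank = (start + 1 + end + 1) / 2
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


women = [66, 77, 57, 62, 68]
men = [79, 71, 68, 71]

# Temporarily combine the samples, rank them, then split the ranks again.
ranks = midranks(women + men)
sum_women = sum(ranks[:len(women)])
sum_men = sum(ranks[len(women):])
print(sum_women, sum_men)  # 18.5 26.5
```

The two tied values of 68 each receive 4.5 and the two tied values of 71 each receive 6.5, matching the table above.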
NOW YOU CAN DO
Exercises 7–10.
Suppose we have two independent samples. Let $M_1$ and $M_2$ represent the population medians corresponding to the first and second samples, respectively. Then we have the following two-tailed hypotheses for the Wilcoxon rank sum test:
$$H_0: M_1 = M_2 \quad \text{versus} \quad H_a: M_1 \neq M_2$$
The null hypothesis states that the two populations have the same median. If this is true, we expect that $R_1$, the sum of the ranks for the first sample, will not be very different from $R_2$, the sum of the ranks for the second sample. Large differences between $R_1$ and $R_2$ will therefore lead us to reject the null hypothesis that no difference exists in the population medians. When the conditions are met, the distribution of $R_1$ is approximately normal.
When performing the Wilcoxon rank sum test, we need to find $R_1$ only, the sum of the ranks for the first sample. It is not necessary to find the sum of the ranks for the second sample, $R_2$.
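One way to see why the first rank sum alone is enough: the combined ranks always add up to the same total, so the second rank sum is determined once the first is known. A short check follows, writing $R_1$ and $R_2$ for the two rank sums and $N = n_1 + n_2$ for the combined sample size (this notation is assumed here for illustration). Assigning mean ranks to ties does not change the total.

```latex
R_1 + R_2 = 1 + 2 + \cdots + N = \frac{N(N+1)}{2}
\quad\Longrightarrow\quad
R_2 = \frac{N(N+1)}{2} - R_1
```

For the data of Example 13, $N = 9$ and the two rank sums are 18.5 and 26.5, which add to $45 = \frac{9 \cdot 10}{2}$.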
Wilcoxon Rank Sum Test for Two Independent Samples
The requirements are that (a) the samples are independent random samples, (b) each sample size is larger than 10, and (c) the shapes of the distributions are the same. It is not required that the populations be normally distributed. Note: In this section, we assume that condition (c) is satisfied.
Step 1 State the hypotheses.
Choose one of the forms in Table 13.
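Null hypothesis | Alternative hypothesis | Type of test |
---|---|---|
$H_0: M_1 = M_2$ | $H_a: M_1 > M_2$ | Right-tailed test |
$H_0: M_1 = M_2$ | $H_a: M_1 < M_2$ | Left-tailed test |
$H_0: M_1 = M_2$ | $H_a: M_1 \neq M_2$ | Two-tailed test |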
Step 2 Find the critical value and state the rejection rule.
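Use Table 14 to find the critical value and the rejection rule. Here $Z_{\text{data}}$ is the test statistic from Step 3, and $Z_{\text{crit}}$ is the positive standard normal critical value for the chosen level of significance and type of test.
Form of hypothesis test | Right-tailed | Left-tailed | Two-tailed |
---|---|---|---|
Rejection rule | Reject $H_0$ if $Z_{\text{data}} \geq Z_{\text{crit}}$ | Reject $H_0$ if $Z_{\text{data}} \leq -Z_{\text{crit}}$ | Reject $H_0$ if $Z_{\text{data}} \leq -Z_{\text{crit}}$ or $Z_{\text{data}} \geq Z_{\text{crit}}$ |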
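The rejection rules for a normal-based test can be sketched in code. The following is a minimal illustration, assuming the critical value comes from the standard normal distribution; the function names `z_critical` and `reject_h0` and the level 0.05 in the usage lines are hypothetical choices, not taken from the table.

```python
from statistics import NormalDist


def z_critical(alpha, tail):
    """Positive standard normal critical value for the given type of test."""
    if tail == "two":
        # A two-tailed test splits alpha evenly between the two tails.
        return NormalDist().inv_cdf(1 - alpha / 2)
    if tail in ("left", "right"):
        return NormalDist().inv_cdf(1 - alpha)
    raise ValueError("tail must be 'left', 'right', or 'two'")


def reject_h0(z_data, alpha, tail):
    """Apply the rejection rule for a right-, left-, or two-tailed test."""
    z_crit = z_critical(alpha, tail)
    if tail == "right":
        return z_data >= z_crit
    if tail == "left":
        return z_data <= -z_crit
    return z_data <= -z_crit or z_data >= z_crit


# Illustrative level of significance only.
print(round(z_critical(0.05, "left"), 3))  # 1.645
print(round(z_critical(0.05, "two"), 3))   # 1.96
```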
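Step 3 Find the value of the test statistic $Z_{\text{data}}$.
$$Z_{\text{data}} = \frac{R_1 - \mu_R}{\sigma_R}$$
where
$$\mu_R = \frac{n_1(n_1 + n_2 + 1)}{2}, \qquad \sigma_R = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}$$
$n_1$ and $n_2$ represent the sample sizes for samples 1 and 2, respectively, and $R_1$ is the sum of the ranks for the first sample.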
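The test statistic is a standardized rank sum, which can be sketched as a small function. This is an illustration under the usual normal approximation for the rank sum of the first sample (mean $n_1(n_1+n_2+1)/2$, variance $n_1 n_2 (n_1+n_2+1)/12$, no adjustment for ties); the name `rank_sum_z` and its parameter names are hypothetical.

```python
import math


def rank_sum_z(r1, n1, n2):
    """Standardize r1, the sum of the ranks for the first sample."""
    # Mean and standard deviation of the rank sum when the null hypothesis holds.
    mu_r = n1 * (n1 + n2 + 1) / 2
    sigma_r = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (r1 - mu_r) / sigma_r


# A rank sum equal to its expected value gives a test statistic of zero.
print(rank_sum_z(144, 12, 11))  # 0.0
```

A rank sum below its expected value gives a negative statistic, which is the direction that supports a left-tailed alternative.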
EXAMPLE 14 Performing the Wilcoxon rank sum test
We are interested in testing whether the population median pulse rate for women (Population 1) is less than that for men (Population 2). We use the data from Example 13 supplemented with an additional seven women and seven men, sampled randomly and independently. The data are presented below.8 Perform the Wilcoxon rank sum test at level of significance .
Women | 66 | 77 | 57 | 62 | 68 | 78 | 73 | 81 | 84 | 69 | 62 | 79 |
Men | 79 | 71 | 68 | 71 | 68 | 86 | 73 | 58 | 68 | 74 | 78 |
Solution
The data were obtained using independent random samples, and we assume that the distributions of the populations have the same shape. We also have $n_1 = 12$ and $n_2 = 11$, both larger than 10, so the conditions for performing the Wilcoxon rank sum test are satisfied.
Step 1 State the hypotheses. The key words “less than” indicate that we have a left-tailed test, from Table 13:
$$H_0: M_1 = M_2 \quad \text{versus} \quad H_a: M_1 < M_2$$
where $M_1$ and $M_2$ represent the population median pulse rates for the women (first sample) and the men (second sample), respectively.
Step 3 Find the value of the test statistic. We combine the two samples and arrange in increasing order. We then rank the data values from smallest to largest, as shown in the following table, assigning ties to the mean rank value.
Combined data | 57 | 58 | 62 | 62 | 66 | 68 | 68 | 68 | 68 | 69 | 71 | 71 |
Rank | 1 | 2 | 3.5 | 3.5 | 5 | 7.5 | 7.5 | 7.5 | 7.5 | 10 | 11.5 | 11.5 |
Combined data | 73 | 73 | 74 | 77 | 78 | 78 | 79 | 79 | 81 | 84 | 86 | |
Rank | 13.5 | 13.5 | 15 | 16 | 17.5 | 17.5 | 19.5 | 19.5 | 21 | 22 | 23 |
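The sum of the ranks for the women is
$$R_1 = 5 + 16 + 1 + 3.5 + 7.5 + 17.5 + 13.5 + 21 + 22 + 10 + 3.5 + 19.5 = 140$$
We have
$$\mu_R = \frac{n_1(n_1 + n_2 + 1)}{2} = \frac{12(12 + 11 + 1)}{2} = 144, \qquad \sigma_R = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}} = \sqrt{\frac{(12)(11)(24)}{12}} = \sqrt{264} \approx 16.2481$$
So that
$$Z_{\text{data}} = \frac{R_1 - \mu_R}{\sigma_R} = \frac{140 - 144}{16.2481} \approx -0.2462$$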
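The arithmetic in this step can be checked with a short script. This is a sketch only: the helper `midrank` is a hypothetical name, and it ranks each value by counting how many combined values are smaller and how many are equal, which gives tied values the mean of their ranks.

```python
import math

women = [66, 77, 57, 62, 68, 78, 73, 81, 84, 69, 62, 79]
men = [79, 71, 68, 71, 68, 86, 73, 58, 68, 74, 78]
combined = women + men


def midrank(x, data):
    """Rank of x within data; tied values receive the mean of their ranks."""
    less = sum(1 for v in data if v < x)
    equal = sum(1 for v in data if v == x)
    # The tied block occupies ranks less+1 .. less+equal, whose mean is this:
    return less + (equal + 1) / 2


n1, n2 = len(women), len(men)
r1 = sum(midrank(x, combined) for x in women)  # rank sum for the women
mu_r = n1 * (n1 + n2 + 1) / 2
sigma_r = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
z_data = (r1 - mu_r) / sigma_r

print(r1, mu_r, round(sigma_r, 4), round(z_data, 4))
# 140.0 144.0 16.2481 -0.2462
```

A test statistic this close to zero lies well inside the range of values expected when the two population medians are equal.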
NOW YOU CAN DO
Exercises 11–14.
EXAMPLE 15 Performing the Wilcoxon rank sum test using technology
A study investigated whether there was a difference in physical activity levels between female adolescents with anorexia nervosa (AN) and those without AN.9 In this study, the amount of physical activity (in minutes for the year) of randomly selected female adolescents with AN (patients) and randomly selected female adolescents without AN (controls) was estimated by interviewing their mothers. The samples were drawn independently. Use Minitab and SPSS to test whether the population median minutes of physical activity for the patients differs from that for the controls, using level of significance .
Solution
Because both the patients and the controls were randomly selected, because both sample sizes are larger than 10, and because we assume that both population shapes are the same, we may proceed with the hypothesis test.
Step 1 State the hypotheses.
$$H_0: M_1 = M_2 \quad \text{versus} \quad H_a: M_1 \neq M_2$$
where $M_1$ and $M_2$ represent the population median activity level (in minutes) of the patients and the controls, respectively.
Step 3 Find the value of the test statistic. We use the instructions provided in the Step-by-Step Technology Guide at the end of this section. Figure 16 shows the Minitab results from the Mann-Whitney test, which is equivalent to the Wilcoxon rank sum test for independent samples. Figure 17 shows the SPSS results from the Mann-Whitney test.
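Software that runs the Mann-Whitney test may report a statistic usually written $U$ rather than a rank sum. The two are linked by $U = R_1 - n_1(n_1 + 1)/2$, where $R_1$ is the sum of the ranks for the first sample, which is one way to see why the tests are equivalent. The sketch below checks this relationship on the pulse-rate data from Example 14, since the study data are not reproduced here; the function names are hypothetical.

```python
women = [66, 77, 57, 62, 68, 78, 73, 81, 84, 69, 62, 79]
men = [79, 71, 68, 71, 68, 86, 73, 58, 68, 74, 78]


def mann_whitney_u(sample1, sample2):
    """Count pairs where the sample1 value is larger; a tied pair counts one half."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0
               for x in sample1 for y in sample2)


def rank_sum(sample1, sample2):
    """Sum of the ranks (mean ranks for ties) of sample1 in the combined data."""
    combined = sample1 + sample2
    total = 0.0
    for x in sample1:
        less = sum(1 for v in combined if v < x)
        equal = sum(1 for v in combined if v == x)
        total += less + (equal + 1) / 2
    return total


n1 = len(women)
u1 = mann_whitney_u(women, men)
r1 = rank_sum(women, men)
print(u1, r1 - n1 * (n1 + 1) / 2)  # 62.0 62.0
```

Because the two statistics differ only by a constant that depends on $n_1$, standardizing either one leads to the same conclusion.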
The highlighted represents in our notation. We have
So that