14.4 Wilcoxon Rank Sum Test for Two Independent Samples


OBJECTIVE By the end of this section, I will be able to …

  1. Perform the Wilcoxon rank sum test for the difference in population medians, using two independent samples

In Section 14.3, we compared data from dependent samples. Here, in Section 14.4, we analyze data from independent samples. Recall from Section 10.1 that two samples are independent when the subjects selected for the first sample do not determine the subjects in the second sample. In Section 10.2, we learned how to perform a hypothesis test for the difference in population means using two independent samples. The two-sample test that we learned in that section required either that each sample size be large (at least 30) or that each population be normally distributed. Here, in Section 14.4, we will learn about the Wilcoxon rank sum test for the difference in population medians using two independent samples, which has less stringent conditions.

1 Wilcoxon Rank Sum Test for the Difference in Population Medians Using Two Independent Samples

The Wilcoxon rank sum test is equivalent to the Mann-Whitney test, another nonparametric test used in some textbooks to test for the difference in population medians. (By an “equivalent hypothesis test,” we mean a hypothesis test that is applicable to the same situations and always provides the same conclusions.)

The requirements for the Wilcoxon rank sum test are less strict, as we shall see.

The Wilcoxon rank sum test is a nonparametric hypothesis test in which the original data from two independent samples are transformed into their ranks. It tests whether the two population medians are equal or not.

In the Wilcoxon rank sum test, the two samples are temporarily combined, and the ranks of the combined data values are calculated. Then the ranks are summed separately for each sample.

EXAMPLE 13 Finding the ranks of combined data and summing the ranks for each sample

The following table shows the pulse rates in beats per minute for a random sample of five women and a random sample of four men.

  1. Combine the data sets and find the ranks.
  2. Find the sum of the ranks for the women and the sum of the ranks for the men.
    Women 66 77 57 62 68
    Men 79 71 68 71

Solution

  1. We temporarily combine the two samples and arrange the values in increasing order. We then rank the data values from smallest to largest, as shown in the following table. Note that we have two pulse rates of 68 beats per minute. Had these not been tied, they would have had ranks 4 and 5. We therefore assign to each the mean rank $(4 + 5)/2 = 4.5$. Similarly, the two pulse rates of 71 beats per minute, which would otherwise have had ranks 6 and 7, are assigned the mean rank $(6 + 7)/2 = 6.5$.


    Combined data 57 62 66 68 68 71 71 77 79
    Rank 1 2 3 4.5 4.5 6.5 6.5 8 9
  2. The sum of the ranks for the women is

    $R_1 = 3 + 8 + 1 + 2 + 4.5 = 18.5$

    The sum of the ranks for the men is

    $R_2 = 9 + 6.5 + 4.5 + 6.5 = 26.5$
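
The ranking in Example 13 can also be checked with a short program. The following is a minimal sketch in Python, not part of the original example; the helper name `average_ranks` is hypothetical, chosen here for illustration. It ranks the combined data, gives tied values their mean rank, and then sums the ranks for each sample.

```python
def average_ranks(values):
    """Rank values from smallest to largest; tied values share their mean rank."""
    # Indices of the values, arranged so the values are in increasing order.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend `end` to cover the whole run of tied values.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end + 2) / 2  # mean of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


women = [66, 77, 57, 62, 68]
men = [79, 71, 68, 71]

# Temporarily combine the samples, rank, then split the ranks back out.
ranks = average_ranks(women + men)
rank_sum_women = sum(ranks[:len(women)])
rank_sum_men = sum(ranks[len(women):])
print(rank_sum_women, rank_sum_men)  # prints: 18.5 26.5
```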

NOW YOU CAN DO

Exercises 7–10.

Suppose we have two independent samples. Let $M_1$ and $M_2$ represent the population medians of the populations from which the first and second samples are drawn, respectively. Then we have the following two-tailed hypotheses for the Wilcoxon rank sum test:

$H_0: M_1 = M_2$ versus $H_a: M_1 \neq M_2$

The null hypothesis states that the two populations have the same median. If this is true, we expect that $R_1$, the sum of the ranks for the first sample, will not be very different from $R_2$, the sum of the ranks for the second sample. Large differences between $R_1$ and $R_2$ will therefore lead us to reject the null hypothesis that no difference exists in the population medians. When the conditions are met, the distribution of $R_1$ is approximately normal.

When performing the Wilcoxon rank sum test, we need to find only $R_1$, the sum of the ranks for the first sample. It is not necessary to find the sum of the ranks for the second sample, $R_2$.
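
One way to see why the second rank sum adds no new information is sketched below. Writing $R_1$ and $R_2$ for the two rank sums and $N = n_1 + n_2$ for the combined sample size (the symbol $N$ is introduced here for illustration), the combined ranks always add up to the same total, even when ties are given mean ranks, so $R_2$ is determined once $R_1$ is known.

```latex
R_1 + R_2 = 1 + 2 + \cdots + N = \frac{N(N+1)}{2}, \qquad N = n_1 + n_2

% Check with the pulse rate data of Example 13: n_1 = 5, n_2 = 4, N = 9
18.5 + 26.5 = 45 = \frac{9 \cdot 10}{2}
```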

Wilcoxon Rank Sum Test for Two Independent Samples

The requirements are that (a) the samples are independent random samples, (b) each sample size is larger than 10, and (c) the shapes of the distributions are the same. It is not required that the populations be normally distributed. Note: In this section, we assume that condition (c) is satisfied.

  • Step 1 State the hypotheses.

    Choose one of the forms in Table 13.

    Table 13 Hypotheses for the Wilcoxon rank sum test
    Null hypothesis      Alternative hypothesis    Type of test
    $H_0: M_1 = M_2$     $H_a: M_1 > M_2$          Right-tailed test
    $H_0: M_1 = M_2$     $H_a: M_1 < M_2$          Left-tailed test
    $H_0: M_1 = M_2$     $H_a: M_1 \neq M_2$       Two-tailed test
  • Step 2 Find the critical value and state the rejection rule.

    Use Table 14 to find the critical value and the rejection rule.

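    Table 14 Critical values and rejection rules for the Wilcoxon rank sum test
                                  Form of hypothesis test
    Level of significance α       Right-tailed                     Left-tailed                      Two-tailed
    0.10                          $Z_{\text{crit}} = 1.28$         $Z_{\text{crit}} = -1.28$        $Z_{\text{crit}} = 1.645$
    0.05                          $Z_{\text{crit}} = 1.645$        $Z_{\text{crit}} = -1.645$       $Z_{\text{crit}} = 1.96$
    0.01                          $Z_{\text{crit}} = 2.33$         $Z_{\text{crit}} = -2.33$        $Z_{\text{crit}} = 2.58$
    Rejection rule                Reject $H_0$ if                  Reject $H_0$ if                  Reject $H_0$ if
                                  $Z_{\text{data}} \geq Z_{\text{crit}}$    $Z_{\text{data}} \leq Z_{\text{crit}}$    $Z_{\text{data}} \leq -Z_{\text{crit}}$ or $Z_{\text{data}} \geq Z_{\text{crit}}$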


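  • Step 3 Find the value of the test statistic $Z_{\text{data}}$.

    $Z_{\text{data}} = \dfrac{R_1 - \mu_R}{\sigma_R}$

    where

    $\mu_R = \dfrac{n_1(n_1 + n_2 + 1)}{2}$ and $\sigma_R = \sqrt{\dfrac{n_1 n_2 (n_1 + n_2 + 1)}{12}}$,

    $R_1$ is the sum of the ranks for the first sample, $n_1$ and $n_2$ represent the sample sizes for samples 1 and 2, respectively, and $n_1 > 10$ and $n_2 > 10$.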

  • Step 4 State the conclusion and the interpretation. Compare the test statistic with the critical value, using the rejection rule.
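
The four steps above can be sketched as a small Python function. This is an illustrative sketch under stated assumptions, not a definitive implementation: the function names `average_ranks` and `rank_sum_test` are hypothetical, the test statistic is taken to be the rank sum of the first sample standardized by its mean $n_1(n_1 + n_2 + 1)/2$ and standard deviation $\sqrt{n_1 n_2 (n_1 + n_2 + 1)/12}$, and the two samples at the bottom are artificial values chosen only to exercise the code.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from smallest to largest; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end + 2) / 2  # mean of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def rank_sum_test(sample1, sample2, z_crit, tail):
    """Return (R1, Z_data, reject) for tail in {'right', 'left', 'two'}.

    z_crit is the critical value for the chosen form of the test:
    positive for 'right' and 'two', negative for 'left'.
    """
    n1, n2 = len(sample1), len(sample2)
    ranks = average_ranks(list(sample1) + list(sample2))
    r1 = sum(ranks[:n1])                          # rank sum of sample 1
    mu_r = n1 * (n1 + n2 + 1) / 2                 # mean of R1 under H0
    sigma_r = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # std. dev. of R1 under H0
    z_data = (r1 - mu_r) / sigma_r
    if tail == "right":
        reject = z_data >= z_crit
    elif tail == "left":
        reject = z_data <= z_crit
    elif tail == "two":
        reject = z_data <= -z_crit or z_data >= z_crit
    else:
        raise ValueError("tail must be 'right', 'left', or 'two'")
    return r1, z_data, reject


# Artificial, completely separated samples (not real data): n1 = 11, n2 = 12.
low = list(range(1, 12))
high = list(range(12, 24))
r1, z_data, reject = rank_sum_test(low, high, z_crit=-1.645, tail="left")
print(r1, round(z_data, 2), reject)  # prints: 66.0 -4.06 True
```

Because every value in the first artificial sample lies below every value in the second, the rank sum of the first sample is as small as it can be, and the left-tailed test rejects the null hypothesis.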

EXAMPLE 14 Performing the Wilcoxon rank sum test

We are interested in testing whether the population median pulse rate for women (Population 1) is less than that for men (Population 2). We use the data from Example 13 supplemented with an additional seven women and seven men, sampled randomly and independently. The data are presented below.⁸ Perform the Wilcoxon rank sum test at level of significance $\alpha = 0.05$.

Women 66 77 57 62 68 78 73 81 84 69 62 79
Men 79 71 68 71 68 86 73 58 68 74 78

Solution

The data were obtained using independent random samples, and we assume that the distributions of the populations have the same shape. Also, we have $n_1 = 12 > 10$ and $n_2 = 11 > 10$, so the conditions for performing the Wilcoxon rank sum test are satisfied.

  • Step 1 State the hypotheses. The key words “less than” indicate that we have a left-tailed test, from Table 13:

    $H_0: M_1 = M_2$ versus $H_a: M_1 < M_2$

    where $M_1$ and $M_2$ represent the population median pulse rates of women (Population 1) and men (Population 2), respectively.

  • Step 2 Find the critical value and state the rejection rule. The level of significance is $\alpha = 0.05$, so from Table 14 our critical value is $Z_{\text{crit}} = -1.645$, and our rejection rule is to reject $H_0$ if $Z_{\text{data}} \leq -1.645$.
  • Step 3 Find the value of the test statistic. We combine the two samples and arrange the values in increasing order. We then rank the data values from smallest to largest, as shown in the following table, assigning tied values their mean rank.

    Combined data 57 58 62 62 66 68 68 68 68 69 71 71
    Rank 1 2 3.5 3.5 5 7.5 7.5 7.5 7.5 10 11.5 11.5
    Combined data 73 73 74 77 78 78 79 79 81 84 86
    Rank 13.5 13.5 15 16 17.5 17.5 19.5 19.5 21 22 23

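    The sum of the ranks for the women is

    $R_1 = 5 + 16 + 1 + 3.5 + 7.5 + 17.5 + 13.5 + 21 + 22 + 10 + 3.5 + 19.5 = 140$

    We have

    $\mu_R = \dfrac{12(12 + 11 + 1)}{2} = 144$ and $\sigma_R = \sqrt{\dfrac{(12)(11)(12 + 11 + 1)}{12}} = \sqrt{264} \approx 16.25$

    So that

    $Z_{\text{data}} = \dfrac{140 - 144}{16.25} \approx -0.25$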

  • Step 4 State the conclusion and the interpretation. We said in Step 2 that we would reject $H_0$ if $Z_{\text{data}} \leq -1.645$. But $Z_{\text{data}} \approx -0.25$, which is not ≤ −1.645. Therefore, our conclusion is to not reject $H_0$. There is insufficient evidence that the population median pulse rate for women is less than that for men.
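
The arithmetic in Example 14 can be verified with a few lines of Python. This is a sketch only, under two assumptions stated here: the women's ranks are read off the combined ranking in Step 3, and the level of significance is taken to be 0.05, the level that corresponds to a left-tailed critical value of −1.645. The critical value is obtained from the standard normal distribution through `statistics.NormalDist` in the Python standard library.

```python
from math import sqrt
from statistics import NormalDist

# Ranks of the 12 women's pulse rates, read off the combined ranking in Step 3.
women_ranks = [5, 16, 1, 3.5, 7.5, 17.5, 13.5, 21, 22, 10, 3.5, 19.5]
n1, n2 = 12, 11  # sample sizes for women and men

r1 = sum(women_ranks)                         # sum of the ranks for sample 1
mu_r = n1 * (n1 + n2 + 1) / 2                 # 12 * 24 / 2
sigma_r = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # sqrt(264)
z_data = (r1 - mu_r) / sigma_r

alpha = 0.05                                  # assumed level of significance
z_crit = NormalDist().inv_cdf(alpha)          # left-tailed critical value

print(r1, mu_r, round(sigma_r, 2))            # prints: 140.0 144.0 16.25
print(round(z_data, 2), round(z_crit, 3))     # prints: -0.25 -1.645
print(z_data <= z_crit)                       # prints: False (do not reject)
```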

NOW YOU CAN DO

Exercises 11–14.

EXAMPLE 15 Performing the Wilcoxon rank sum test using technology


A study investigated whether there was a difference in physical activity levels between female adolescents with anorexia nervosa (AN) and those without AN.⁹ In this study, the amount of physical activity (in minutes for the year) of randomly selected female adolescents with AN (patients) and randomly selected female adolescents without AN (controls) was estimated by interviewing their mothers. The samples were drawn independently. Use Minitab and SPSS to test whether the population median minutes of physical activity for the patients differs from that for the controls, using level of significance $\alpha = 0.01$.

Solution

Because both the patients and the controls were randomly selected, because $n_1 > 10$ and $n_2 > 10$, and because we assume that both population shapes are the same, we may proceed with the hypothesis test.

  • Step 1 State the hypotheses.

    $H_0: M_1 = M_2$ versus $H_a: M_1 \neq M_2$

    where $M_1$ and $M_2$ represent the population median activity levels (in minutes) of the patients and the controls, respectively.

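  • Step 2 Find the critical value and state the rejection rule. The level of significance is $\alpha = 0.01$, so our critical value is $Z_{\text{crit}} = 2.58$. We will reject $H_0$ if $Z_{\text{data}} \leq -2.58$ or if $Z_{\text{data}} \geq 2.58$.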
  • Step 3 Find the value of the test statistic. We use the instructions provided in the Step-by-Step Technology Guide at the end of this section. Figure 16 shows the Minitab results from the Mann-Whitney test, which is equivalent to the Wilcoxon rank sum test for independent samples. Figure 17 shows the SPSS results from the Mann-Whitney test.

    FIGURE 16 Minitab results.

    FIGURE 17 SPSS results.

    The highlighted value represents $R_1$ in our notation. We have

    So that

  • Step 4 State the conclusion and the interpretation. We said we will reject $H_0$ if $Z_{\text{data}} \leq -2.58$ or if $Z_{\text{data}} \geq 2.58$. The value of $Z_{\text{data}}$ found in Step 3 is greater than 2.58. Therefore, we reject $H_0$. There is evidence that the population median amount of physical activity for female adolescents with AN differs from the population median amount of physical activity for female adolescents without AN.
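
Because the software in this example reports Mann-Whitney output, it may help to see numerically why the two tests agree. The sketch below is an illustration, not the output of Minitab or SPSS. It assumes the common definition of the Mann-Whitney statistic for the first sample, $U_1 = R_1 - n_1(n_1 + 1)/2$, and it reuses the pulse rate ranks from Example 14, since the data for the physical activity study are not reproduced in this section. Software packages differ in which of these statistics they print.

```python
from math import sqrt

# Ranks of the women's pulse rates in Example 14 (sample 1), and sample sizes.
women_ranks = [5, 16, 1, 3.5, 7.5, 17.5, 13.5, 21, 22, 10, 3.5, 19.5]
n1, n2 = 12, 11

r1 = sum(women_ranks)          # Wilcoxon rank sum for sample 1
u1 = r1 - n1 * (n1 + 1) / 2    # assumed Mann-Whitney U definition

# U1 is R1 shifted by a constant, so both have the same standard deviation.
sigma = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)

z_from_r = (r1 - n1 * (n1 + n2 + 1) / 2) / sigma  # standardized rank sum
z_from_u = (u1 - n1 * n2 / 2) / sigma             # standardized U

print(u1)                                      # prints: 62.0
print(round(z_from_r, 4), round(z_from_u, 4))  # prints: -0.2462 -0.2462
```

The two standardized values coincide, which is why a Mann-Whitney test and a Wilcoxon rank sum test on the same data lead to the same conclusion.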