OBJECTIVES By the end of this section, I will be able to …
1 What Is a Nonparametric Hypothesis Test?
In Chapters 9, 10, 12, and 13, we learned how to perform hypothesis tests for population parameters, such as the population mean $\mu$ or the population proportion $p$. To perform each of these parametric hypothesis tests, certain conditions need to be satisfied. For example, Section 9.4 showed that the required condition of a $t$ test for the population mean $\mu$, when we have a small sample size, is that the population be normally distributed. However, what if we need to perform a test with a small sample and the population is not normal? We turn to one of the nonparametric hypothesis tests, the subject of this chapter.
Parametric hypothesis tests are used to test claims about a population parameter, such as the population mean $\mu$ or the population proportion $p$. Often, parametric tests require that the population follow a particular distribution, such as the normal distribution.
Nonparametric hypothesis tests, also called distribution-free hypothesis tests, generally have fewer required conditions. In particular, nonparametric tests do not require the population to follow a particular distribution, such as the normal distribution.
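To make "distribution-free" concrete, here is a minimal sketch of an exact two-sided sign test for matched-pair differences, the test previewed for Section 14.2. It relies only on the signs of the differences, so under the null hypothesis the count of either sign follows a binomial distribution with probability 0.5, whatever the shape of the population. The function name `sign_test_p_value` and the data are hypothetical, invented for illustration, and the procedure developed later in the chapter may differ in detail.

```python
from math import comb


def sign_test_p_value(differences):
    """Exact two-sided sign test p-value for matched-pair differences."""
    # Zero differences carry no sign information, so they are not counted.
    positives = sum(1 for d in differences if d > 0)
    negatives = sum(1 for d in differences if d < 0)
    n = positives + negatives
    if n == 0:
        raise ValueError("at least one nonzero difference is required")
    k = min(positives, negatives)
    # Under H0 each nonzero difference is positive with probability 1/2,
    # so the count of the rarer sign is Binomial(n, 0.5).
    lower_tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    # Double the lower tail for a two-sided test, capping at 1.
    return min(1.0, 2 * lower_tail)


# Hypothetical differences (second measurement minus first), not real data.
differences = [1.2, 0.4, -0.3, 2.1, 0.8, 1.5, 0.9, -0.1, 1.1, 0.6]
p_value = sign_test_p_value(differences)
print(f"p-value = {p_value:.4f}")
```

Note that the magnitudes of the differences never enter the calculation. Discarding that information is what frees the test from a normality condition.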
Recall that we should not perform a parametric hypothesis test (such as the $t$ test for the population mean $\mu$) if the conditions are not met. Why, then, would a data analyst take a chance and use a parametric test when the conditions may not be satisfied? The answer is that there are advantages and disadvantages to each method.
Advantages of Nonparametric Hypothesis Tests
Disadvantages of Nonparametric Hypothesis Tests
2 The Efficiency of a Nonparametric Hypothesis Test
In general, parametric tests are more efficient than corresponding nonparametric tests. The efficiency of a nonparametric test is used to compare it with its corresponding parametric test.
The efficiency of a nonparametric hypothesis test is defined as the ratio of the sample size required for the corresponding parametric test to the sample size required for the nonparametric test, in order to achieve the same result (such as correctly rejecting the null hypothesis). The efficiency ratings are reported on the assumption that required conditions for both the parametric and the nonparametric tests have been met.
For example, in Section 14.3 we will learn about the Wilcoxon signed rank test for matched-pair data. The corresponding parametric test is the test for the difference in means for dependent samples that we learned about in Section 10.1. If a certain result is achieved by using the Wilcoxon signed rank test with a sample size of 100, an equivalent result may be obtained using the dependent-samples test with a sample size of 95. Thus, the efficiency of the Wilcoxon signed rank test (assuming that the conditions have been met for both tests) is

$$\text{efficiency} = \frac{\text{sample size for parametric test}}{\text{sample size for nonparametric test}} = \frac{95}{100} = 0.95$$
Thus, the Wilcoxon signed rank test is fairly efficient compared with the dependent-samples test. On the other hand, the sign test that we will learn about in Section 14.2 has an efficiency of only 0.63, meaning that the corresponding dependent-samples test requires a sample size of only 63 to achieve the same result that the sign test achieves with a sample size of 100. Thus, the sign test is less efficient than the Wilcoxon signed rank test. However, as we shall see, the conditions for performing the Wilcoxon signed rank test are stricter than for performing the sign test. As is often the case, there is a tradeoff between the efficiency of a test and the conditions required for performing the test.
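The two efficiency figures quoted above follow directly from the definition. The short sketch below reproduces the arithmetic; the function name `efficiency` is a hypothetical helper chosen for illustration.

```python
def efficiency(n_parametric, n_nonparametric):
    """Ratio of the parametric sample size to the nonparametric sample size
    needed to achieve the same result."""
    return n_parametric / n_nonparametric


# Sample sizes taken from the text: each nonparametric test uses n = 100.
wilcoxon_efficiency = efficiency(95, 100)
sign_efficiency = efficiency(63, 100)

print(f"Wilcoxon signed rank test: {wilcoxon_efficiency:.2f}")
print(f"Sign test: {sign_efficiency:.2f}")
```

A value closer to 1 means the nonparametric test gives up less relative to its parametric counterpart.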
Table 1 contains the efficiency ratings of the nonparametric (distribution-free) hypothesis tests that we will learn about in this chapter. The efficiency ratings are calculated under the assumption that the conditions for both the parametric and the nonparametric tests have been met.
Section | Situation | Parametric test | Nonparametric test | Efficiency |
---|---|---|---|---|
14.2 | Matched pairs (dependent samples) | $t$ test or $Z$ test | Sign test | 0.63 |
14.3 | Matched pairs (dependent samples) | $t$ test or $Z$ test | Wilcoxon signed rank test | 0.95 |
14.4 | Two independent samples | $t$ test or $Z$ test | Wilcoxon rank sum test | 0.95 |
14.5 | Several independent samples | Analysis of variance ($F$ test) | Kruskal-Wallis test | 0.95 |
14.6 | Correlation | Linear correlation | Rank correlation test | 0.91 |
14.7 | Randomness | No parametric test | Runs test | — |
Note: A data analyst could perform both the parametric test and the nonparametric test and leave it up to the client or the end user of the data to determine whether the greater efficiency of the parametric test is worth the cost of the more stringent required conditions.
In each case, the parametric test is more efficient than its nonparametric counterpart, though, of course, this greater efficiency comes at the cost of more stringent required conditions for the parametric tests. Thus, when the conditions for the parametric test are met, it is preferable to perform the parametric test as opposed to the nonparametric test.
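One way to read the efficiency ratings is to turn the definition around: since efficiency is the parametric sample size divided by the nonparametric sample size, the nonparametric sample size is roughly the parametric sample size divided by the efficiency. The sketch below applies this to the ratings in Table 1. The parametric sample size of 50 is a hypothetical figure, the helper name is invented for illustration, and the results are rough guides that presume the conditions for both tests are met.

```python
from fractions import Fraction
from math import ceil

# Efficiency ratings from Table 1, stored as exact fractions.
EFFICIENCY = {
    "Sign test": Fraction(63, 100),
    "Wilcoxon signed rank test": Fraction(95, 100),
    "Wilcoxon rank sum test": Fraction(95, 100),
    "Kruskal-Wallis test": Fraction(95, 100),
    "Rank correlation test": Fraction(91, 100),
}


def nonparametric_sample_size(n_parametric, efficiency):
    """Solve efficiency = n_parametric / n_nonparametric for n_nonparametric,
    rounding up to a whole observation."""
    return ceil(n_parametric / efficiency)


n_parametric = 50  # hypothetical sample size for the parametric test
required = {
    test: nonparametric_sample_size(n_parametric, eff)
    for test, eff in EFFICIENCY.items()
}
for test, n in required.items():
    print(f"{test}: about {n} observations to match a parametric n of {n_parametric}")
```

Exact fractions are used here so that the rounding up is not disturbed by floating-point error.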