14 Nonparametric Statistics

14.1 Introduction to Nonparametric Statistics


OBJECTIVES By the end of this section, I will be able to …

Explain what a nonparametric hypothesis test is, and why we use it.
Describe what is meant by the efficiency of a nonparametric test.

1 What Is a Nonparametric Hypothesis Test?

In Chapters 9, 10, 12, and 13, we learned how to perform hypothesis tests for population parameters, such as the population mean μ or the population proportion p. To perform each of these parametric hypothesis tests, certain conditions need to be satisfied. For example, Section 9.4 showed that the required condition of a test for the population mean μ, when we have a small sample size, is that the population be normally distributed. However, what if we need to perform a test with a small sample and the population is not normal? We turn to one of the nonparametric hypothesis tests, the subject of this chapter.

Parametric hypothesis tests are used to test claims about a population parameter, such as the population mean μ or the population proportion p. Often, parametric tests require that the population follow a particular distribution, such as the normal distribution.

Nonparametric hypothesis tests, also called distribution-free hypothesis tests, generally have fewer required conditions. In particular, nonparametric tests do not require the population to follow a particular distribution, such as the normal distribution.

Recall that we should not perform a parametric hypothesis test (such as the test for the population mean μ) if the conditions are not met. Why, then, would a data analyst take a chance and use a parametric test when the conditions may not be satisfied? The answer is that there are advantages and disadvantages to each method.

Advantages of Nonparametric Hypothesis Tests

1. Nonparametric methods may be used on a greater variety of data because they require fewer conditions than their parametric counterparts. For this reason, it is less likely that nonparametric hypothesis tests will be performed inappropriately.
2. Nonparametric methods can be applied to categorical (qualitative) data, such as class standing (freshman, sophomore, junior, or senior).
3. For certain nonparametric procedures, the manual computations tend to be easier than their parametric counterparts. (However, see Disadvantage 3 below.)

Disadvantages of Nonparametric Hypothesis Tests

1. Nonparametric hypothesis tests are less efficient than parametric tests. This means that, for a given level of significance α, nonparametric tests require a larger sample size to reject a null hypothesis (more on efficiency below).
2. Nonparametric tests replace the actual data values with either signs (positive or negative) or ranks. Thus, the information carried by the exact data values is lost. For example, in the nonparametric sign test performed in Section 14.2, the actual data values are discarded and replaced with positive or negative signs.
3. Because the use of nonparametric hypothesis tests is less widespread, graphing calculators and statistical software often do not have dedicated procedures for these tests.
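
To make the loss of information concrete, the following minimal Python sketch reduces a set of matched-pair differences first to signs and then to ranks. The data values and the helper names `to_signs` and `to_ranks` are hypothetical, chosen only for illustration. Ranking the absolute differences, with tied values sharing the average rank, is one common convention and is assumed here.

```python
def to_signs(differences):
    """Replace each nonzero difference with '+' or '-'; zeros carry no sign and are dropped."""
    return ['+' if d > 0 else '-' for d in differences if d != 0]


def to_ranks(values):
    """Rank values from smallest (rank 1) to largest; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # sorted positions i..j hold ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


# Hypothetical matched-pair differences; not real data.
differences = [2.5, -0.4, 7.1, 0.9, -3.2, 0.9]

signs = to_signs(differences)
ranks = to_ranks([abs(d) for d in differences])

print(signs)
print(ranks)
```

In this sketch a difference of 7.1 and a difference of 0.9 both become a plus sign, so a test based on signs cannot tell them apart. Ranks preserve the ordering of the differences but not their magnitudes.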


2 The Efficiency of a Nonparametric Hypothesis Test

In general, parametric tests are more efficient than corresponding nonparametric tests. The efficiency of a nonparametric test is used to compare it with its corresponding parametric test.

The efficiency of a nonparametric hypothesis test is defined as the ratio of the sample size required for the corresponding parametric test to the sample size required for the nonparametric test, in order to achieve the same result (such as correctly rejecting the null hypothesis). The efficiency ratings are reported on the assumption that required conditions for both the parametric and the nonparametric tests have been met.

For example, in Section 14.3 we will learn about the Wilcoxon signed rank test for matched-pair data. The corresponding parametric test is the test for the difference in means for dependent samples that we learned about in Section 10.1. If a certain result is achieved by using the Wilcoxon signed rank test with a sample size of 100, an equivalent result may be obtained using the dependent-samples test with a sample size of 95. Thus, the efficiency of the Wilcoxon signed rank test (assuming that the conditions have been met for both tests) is

    efficiency = (sample size for dependent-samples test) / (sample size for Wilcoxon signed rank test) = 95 / 100 = 0.95

The Wilcoxon signed rank test is therefore fairly efficient compared with the dependent-samples test. On the other hand, the sign test that we will learn about in Section 14.2 has an efficiency of only 0.63, meaning that the corresponding dependent-samples test requires a sample size of only 63 to achieve the same result that the sign test achieves with a sample size of 100. Thus, the sign test is less efficient than the Wilcoxon signed rank test. However, as we shall see, the conditions for performing the Wilcoxon signed rank test are stricter than for performing the sign test. As is often the case, there is a tradeoff between the efficiency of a test and the conditions required for performing the test.
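
The arithmetic behind these two efficiency ratings can be sketched in a few lines of Python. The function names `efficiency` and `parametric_sample_size` are hypothetical. The second function simply inverts the definition of efficiency, which is an illustrative assumption rather than a procedure stated in this section.

```python
def efficiency(n_parametric, n_nonparametric):
    """Ratio of the parametric sample size to the nonparametric sample size
    needed to achieve the same result."""
    return n_parametric / n_nonparametric


def parametric_sample_size(n_nonparametric, eff):
    """Sample size the parametric test needs to match a nonparametric test
    run on n_nonparametric observations, given the efficiency rating."""
    return round(eff * n_nonparametric)


# Wilcoxon signed rank test: 100 observations versus 95 for the parametric test.
wilcoxon_eff = efficiency(95, 100)

# Sign test: efficiency 0.63 with 100 observations.
sign_test_parametric_n = parametric_sample_size(100, 0.63)

print(wilcoxon_eff)
print(sign_test_parametric_n)
```

The first value reproduces the 0.95 rating of the Wilcoxon signed rank test, and the second recovers the sample size of 63 quoted for the dependent-samples test.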

Table 1 contains the efficiency ratings of the nonparametric (distribution-free) hypothesis tests that we will learn about in this chapter. The efficiency ratings are calculated under the assumption that the conditions for both the parametric and the nonparametric tests have been met.

Table 1 Efficiency of nonparametric tests compared with parametric tests

Section	Situation	Parametric test	Nonparametric test	Efficiency
14.2	Matched pairs (dependent samples)	t test or Z test	Sign test	0.63
14.3	Matched pairs (dependent samples)	t test or Z test	Wilcoxon signed rank test	0.95
14.4	Two independent samples	t test or Z test	Wilcoxon rank sum test	0.95
14.5	Several independent samples	Analysis of variance (F test)	Kruskal-Wallis test	0.95
14.6	Correlation	Linear correlation	Rank correlation test	0.91
14.7	Randomness	No parametric test	Runs test	—
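
Read in the other direction, the efficiency ratings suggest roughly how many observations a nonparametric test needs in order to match a parametric test. The Python sketch below applies that idea to the ratings in Table 1. The parametric sample size of 50, the dictionary name `EFFICIENCY`, and the function name `nonparametric_sample_size` are hypothetical, and rounding up to the next whole observation is an assumption made for illustration. The results should be treated as rough guides only.

```python
import math

# Efficiency ratings copied from Table 1; None marks the runs test,
# which has no parametric counterpart.
EFFICIENCY = {
    "Sign test": 0.63,
    "Wilcoxon signed rank test": 0.95,
    "Wilcoxon rank sum test": 0.95,
    "Kruskal-Wallis test": 0.95,
    "Rank correlation test": 0.91,
    "Runs test": None,
}


def nonparametric_sample_size(n_parametric, eff):
    """Smallest whole sample size at which the nonparametric test matches a
    parametric test run on n_parametric observations."""
    return math.ceil(n_parametric / eff)


n_parametric = 50  # hypothetical sample size for the parametric test

for test, eff in EFFICIENCY.items():
    if eff is None:
        print(f"{test}: no parametric counterpart")
    else:
        needed = nonparametric_sample_size(n_parametric, eff)
        print(f"{test}: about {needed} observations")
```

With these assumptions, the sign test needs about 80 observations to match a parametric test on 50, while the tests rated 0.95 need about 53.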

Note: A data analyst could perform both the parametric test and the nonparametric test and leave it up to the client or the end user of the data to determine whether the greater efficiency of the parametric test is worth the cost of the more stringent required conditions.

In each case where a parametric counterpart exists, the parametric test is more efficient than its nonparametric counterpart, though, of course, this greater efficiency comes at the cost of more stringent required conditions for the parametric tests. Thus, when the conditions for the parametric test are met, it is preferable to perform the parametric test as opposed to the nonparametric test.