Chapter 19: Simulation

Thinking about independence

Before turning to more elaborate simulations, it is worth examining the concept of independence further. We said earlier that independence can be verified only by observing many repetitions of random phenomena. It is probably more accurate to say that a lack of independence can be verified only by observing many repetitions of random phenomena. How does one recognize that two random phenomena are not independent? For example, how can we tell if tosses of a fair coin (that is, one for which the probability of a head is 0.5 and the probability of a tail is 0.5) are not independent?

One approach might be to apply the definition of “independence.” For a sequence of tosses of a fair coin, one could compute the proportion of times in the sequence that a toss is followed by the same outcome—in other words, the frequency with which a head is followed by a head or a tail is followed by a tail. This proportion should be close to 0.5 if tosses are independent (knowing the outcome of one toss does not change the probabilities for outcomes of the next) and if many tosses have been observed.

EXAMPLE 3 Investigating independence

Suppose we tossed a fair coin 15 times and obtained the following sequence of outcomes:

Toss:	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15
Outcome:	H	H	H	T	H	H	T	T	T	T	T	H	H	T	T

Each of the first 14 tosses is followed by another toss, and in nine of these 14 cases the following toss has the same outcome. The proportion of times a toss is followed by the same outcome is

proportion $= \frac{9}{14} \approx 0.64$

For so few tosses, this would not be considered a large departure from 0.5.
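The count in this example can be checked with a short sketch. The function name `same_as_next_proportion` is a hypothetical helper introduced here for illustration; the sequence is the one in the table above.

```python
def same_as_next_proportion(outcomes):
    """Proportion of tosses (excluding the last) followed by the same outcome."""
    # Pair each toss with the toss that follows it: 15 tosses give 14 pairs.
    pairs = list(zip(outcomes, outcomes[1:]))
    same = sum(1 for current, following in pairs if current == following)
    return same / len(pairs)


# The 15 tosses from Example 3.
tosses = "HHHTHHTTTTTHHTT"

proportion = same_as_next_proportion(tosses)
print(round(proportion, 2))  # 9 of the 14 pairs match, so this prints 0.64
```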

Unfortunately, if the proportion of times a head is followed by a head or a tail is followed by a tail is close to 0.5, this does not necessarily imply that the tosses are independent. For example, suppose that instead of tossing the coin we simply placed the coin heads up or tails up according to the following pattern:

Trial:	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15
Outcome:	H	H	T	T	H	H	T	T	H	H	T	T	H	H	T

We begin with two heads, followed by two tails, followed by two heads, etc. If we know the previous outcomes, we know exactly what the next outcome will be. Successive outcomes are not independent. However, looking at the first 14 outcomes, the proportion of times in this sequence that a head is followed by a head or a tail is followed by a tail is

proportion $= \frac{7}{14} = 0.5$

Thus, our approach can help us recognize when independence is lacking, but not when independence is present.
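One way to see the dependence that the proportion misses is to look back two outcomes instead of one. The sketch below is an illustration only: the helper name `followers_by_context` and the choice of a two-outcome lookback are assumptions made here, not a method described in the text.

```python
from collections import defaultdict


def followers_by_context(outcomes, k=2):
    """Map each run of k consecutive outcomes to the set of outcomes seen next."""
    followers = defaultdict(set)
    for i in range(k, len(outcomes)):
        followers[outcomes[i - k:i]].add(outcomes[i])
    return dict(followers)


# The placed (not tossed) sequence from the text.
placed = "HHTTHHTTHHTTHHT"

for context, following in sorted(followers_by_context(placed).items()):
    print(context, "->", sorted(following))
# Each pair of previous outcomes is always followed by the same single outcome,
# so the next outcome is fully predictable even though the proportion is 0.5.
```

Applied to the 15 genuine tosses of Example 3, the same helper finds at least one pair of outcomes (HH) that is followed sometimes by a head and sometimes by a tail, which is what we would expect when the next toss cannot be predicted.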

Another method for assessing independence is based on the concept of correlation, which we discussed in Chapter 14. If two random phenomena have numerical outcomes, and we observe both phenomena in a sequence of n trials, we can compute the correlation for the resulting data. If the random phenomena are independent, there will be no straight-line relationship between them, and the correlation should be close to 0.

It is not necessarily true that two random phenomena are independent if their correlation is 0. In Exercise 14.24 (page 334), there is a clear curved relationship between speed and mileage, but the correlation is 0. Independence implies no relationship at all, but correlation measures the strength of only a straight-line relationship.
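A small sketch can make this concrete. The data below are made up for illustration and are not the speed and mileage data from the exercise; the helper `correlation` is a hypothetical name for the usual correlation formula.

```python
import math


def correlation(xs, ys):
    """Correlation r for paired numerical data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up data: y is completely determined by x, but the pattern is curved.
xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 for x in xs]

# The positive and negative products cancel, so r is 0
# even though y depends entirely on x.
print(correlation(xs, ys))
```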

Because independence implies no relationship, we would expect to see no overall pattern in a scatterplot of the data if the variables are independent. Looking at scatterplots is another method for determining if independence is lacking.

Was he good or was he lucky?

When a baseball player hits .300, everyone applauds. A .300 hitter gets a hit in 30% of times at bat. Could a .300 year just be luck? Typical major leaguers bat about 500 times a season and hit about .260. A hitter’s successive tries seem to be independent. From this model, we can calculate or simulate the probability of hitting .300. It is about 0.025. Out of 100 run-of-the-mill major league hitters, two or three each year will bat .300 just because they were lucky.
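Both routes, calculating and simulating, can be sketched as follows. This is a minimal illustration under stated assumptions: each of the 500 times at bat is an independent try with probability 0.260 of a hit, and "hitting .300" is taken to mean at least 150 hits. The function names, the number of simulated seasons, and the seed are choices made here for illustration.

```python
import math
import random


def prob_at_least(n, p, k_min):
    """Exact probability of k_min or more successes in n independent tries."""
    return sum(
        math.comb(n, k) * p ** k * (1 - p) ** (n - k)
        for k in range(k_min, n + 1)
    )


def simulate_prob_at_least(n, p, k_min, seasons, seed=0):
    """Estimate the same probability by simulating many seasons."""
    rng = random.Random(seed)
    lucky = 0
    for _ in range(seasons):
        hits = sum(1 for _ in range(n) if rng.random() < p)
        if hits >= k_min:
            lucky += 1
    return lucky / seasons


AT_BATS = 500        # typical times at bat per season (from the text)
HIT_PROB = 0.260     # typical batting average (from the text)
HITS_FOR_300 = 150   # assumption: hitting .300 means at least 150 hits in 500

print(prob_at_least(AT_BATS, HIT_PROB, HITS_FOR_300))
print(simulate_prob_at_least(AT_BATS, HIT_PROB, HITS_FOR_300, seasons=2000))
```

The simulated value varies with the seed and the number of seasons, so it should be expected to land near the calculated value rather than match it.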

Many methods for assessing independence exist. For example, if trials are not independent and, say, tossing a head increases the probability that the next toss is also a head, then in a sequence of tosses, we might expect to see unusually long runs of heads. We mentioned this idea of unusually long runs in Example 5 (page 410) of Chapter 17. Unusually long runs of made free throws would be expected if a basketball player has a “hot hand.” However, careful study has shown that runs of baskets made or missed are no more frequent in basketball than would be expected if each shot is independent of the player’s previous shots.
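The idea of judging a run against what independence would produce can be sketched with a simulation. This is an illustration only, not the method used in the basketball studies: it counts runs of either outcome rather than heads alone, and the function names, the number of repetitions, and the seed are assumptions made here.

```python
import random


def longest_run(outcomes):
    """Length of the longest run of identical consecutive outcomes."""
    best = current = 0
    previous = None
    for outcome in outcomes:
        current = current + 1 if outcome == previous else 1
        best = max(best, current)
        previous = outcome
    return best


def simulated_longest_runs(n_tosses, repetitions, seed=0):
    """Longest run in each of many simulated sequences of independent fair tosses."""
    rng = random.Random(seed)
    return [
        longest_run([rng.choice("HT") for _ in range(n_tosses)])
        for _ in range(repetitions)
    ]


# Longest run in the 15 tosses from Example 3 (the five tails in a row).
observed = longest_run("HHHTHHTTTTTHHTT")

# How often do independent fair tosses produce a run at least that long?
runs = simulated_longest_runs(n_tosses=15, repetitions=10_000)
share = sum(1 for r in runs if r >= observed) / len(runs)
print(observed, share)
```

If runs as long as the observed one turn up often in the simulated independent sequences, the observed run gives no reason to doubt independence.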