Simulation basics

Simulation is an effective tool for finding probabilities of complex events once we have a trustworthy probability model. We can use random digits to simulate many repetitions quickly. The proportion of repetitions on which an event occurs will eventually be close to its probability, so simulation can give good estimates of probabilities. The art of simulation is best learned from a series of examples.
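The claim that the proportion of repetitions settles near the probability can be illustrated with a small sketch. It assumes Python's built-in pseudo-random generator is an acceptable stand-in for a table of random digits; the function name `running_proportions` and the checkpoint values are hypothetical, chosen only for illustration.

```python
import random

def running_proportions(p, checkpoints, seed=0):
    """Return the proportion of successes seen so far at each checkpoint.

    p is the assumed probability of the event on a single repetition.
    """
    rng = random.Random(seed)  # seeded so the run is reproducible
    wanted = set(checkpoints)
    successes = 0
    results = {}
    for n in range(1, max(wanted) + 1):
        if rng.random() < p:   # the event occurs on this repetition
            successes += 1
        if n in wanted:
            results[n] = successes / n
    return results

if __name__ == "__main__":
    # Proportions after 10, 100, ... repetitions of an event with probability 0.5
    for n, prop in running_proportions(0.5, [10, 100, 1000, 10000, 100000]).items():
        print(n, round(prop, 4))
```

With few repetitions the proportion can be far from 0.5; with many it is typically much closer, which is why long simulations give better estimates.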

EXAMPLE 1 Doing a simulation

Toss a coin 10 times. What is the probability of a run of at least three consecutive heads or three consecutive tails?

Step 1. Give a probability model. Our model for coin tossing has two parts:

  • Each toss has probabilities 0.5 for a head and 0.5 for a tail.

  • Tosses are independent of each other. That is, knowing the outcome of one toss does not change the probabilities for the outcomes of any other toss.

Step 2. Assign digits to represent outcomes. Digits in Table A of random digits will stand for the outcomes, in a way that matches the probabilities from Step 1. We know that each digit in Table A has probability 0.1 of being any one of 0, 1, 2, 3, 4, 5, 6, 7, 8, or 9 and that successive digits in the table are independent. Here is one assignment of digits for coin tossing:

  • One digit simulates one toss of the coin.

  • Odd digits represent heads; even digits represent tails.

This assignment works because the five odd digits give heads probability 5/10 = 0.5. Any other assignment in which five of the ten digits represent heads is equally good. Successive digits in the table simulate independent tosses.

Really random digits. For purists, the RAND Corporation long ago published a book titled One Million Random Digits. The book lists 1,000,000 digits that were produced by a very elaborate physical randomization and really are random. An employee of RAND once told one of us that this is not the most boring book that RAND has ever published . . .

Step 3. Simulate many repetitions. Ten digits simulate 10 tosses, so looking at 10 consecutive digits in Table A simulates one repetition. Read many groups of 10 digits from the table to simulate many repetitions. Be sure to keep track of whether or not the event we want (a run of three heads or three tails) occurs on each repetition.

Here are the first three repetitions, starting at line 101 in Table A. We have underlined all runs of three or more heads or tails.

               Repetition 1           Repetition 2           Repetition 3
Digits         1 9 2 2 3 9 5 0 3 4    0 5 7 5 6 2 8 7 1 3    9 6 4 0 9 1 2 5 3 1
Heads/tails    H H T T H H H T H T    T H H H T T T H H H    H T T T H H T H H H
                       -----            ----- ----- -----      -----       -----
Run of 3?      YES                    YES                    YES
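The three repetitions above can be checked mechanically. The following is a minimal sketch, assuming the odd = heads, even = tails assignment from Step 2; the helper names `digits_to_tosses` and `has_run` are hypothetical.

```python
def digits_to_tosses(digits):
    """Translate a string of digits into tosses: odd -> 'H', even -> 'T'."""
    return "".join("H" if int(d) % 2 == 1 else "T" for d in digits)

def has_run(tosses, length=3):
    """True if some outcome appears at least `length` (>= 2) times in a row."""
    streak = 1
    for prev, cur in zip(tosses, tosses[1:]):
        streak = streak + 1 if cur == prev else 1
        if streak >= length:
            return True
    return False

# The 30 digits read from line 101 of Table A, split into groups of 10.
repetitions = ["1922395034", "0575628713", "9640912531"]
for digits in repetitions:
    tosses = digits_to_tosses(digits)
    print(digits, tosses, "YES" if has_run(tosses) else "NO")
# prints:
# 1922395034 HHTTHHHTHT YES
# 0575628713 THHHTTTHHH YES
# 9640912531 HTTTHHTHHH YES
```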

Continuing in Table A, we did 25 repetitions; 23 of them had a run of three or more heads or tails. So we estimate the probability of a run by the proportion

estimated probability = 23/25 = 0.92

Of course, 25 repetitions are not enough to be confident that our estimate is accurate. Now that we understand how to do the simulation, we can tell a computer to do many thousands of repetitions. A long simulation (or hard mathematics) finds that the true probability is about 0.826. Most people think runs are somewhat unlikely, so even our short simulation challenges our intuition by showing that runs of three occur most of the time in 10 tosses.
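One way to "tell a computer" to run the long simulation is sketched below. It assumes Python's pseudo-random generator in place of Table A, and the function names (`has_run`, `simulate`, `exact_probability`) are hypothetical. Because there are only 2^10 = 1024 equally likely sequences of 10 tosses, the sketch also counts them directly as a stand-in for the "hard mathematics".

```python
import random
from itertools import product

def has_run(tosses, length=3):
    """True if some outcome appears at least `length` (>= 2) times in a row."""
    streak = 1
    for prev, cur in zip(tosses, tosses[1:]):
        streak = streak + 1 if cur == prev else 1
        if streak >= length:
            return True
    return False

def simulate(repetitions, n_tosses=10, seed=1):
    """Estimate P(run of 3 or more) as the proportion of simulated repetitions with a run."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    hits = 0
    for _ in range(repetitions):
        tosses = [rng.choice("HT") for _ in range(n_tosses)]
        hits += has_run(tosses)
    return hits / repetitions

def exact_probability(n_tosses=10):
    """Count runs over all equally likely head/tail sequences."""
    sequences = list(product("HT", repeat=n_tosses))
    return sum(has_run(s) for s in sequences) / len(sequences)

if __name__ == "__main__":
    print("25 repetitions:     ", simulate(25))
    print("100,000 repetitions:", simulate(100_000))
    print("exact (846/1024):   ", exact_probability())  # 0.826171875
```

The 25-repetition estimate can easily land several percentage points away from 0.826, as the hand simulation's 0.92 did; the long run is typically within a fraction of a percentage point.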

Once you have gained some experience in simulation, setting up the probability model (Step 1) is usually the hardest part of the process. Although coin tossing may not fascinate you, the model in Example 1 is typical of many probability problems because it consists of independent trials (the tosses) all having the same possible outcomes with the same probabilities. Shooting 10 free throws and observing the sexes of 10 children have similar models and are simulated in much the same way. The new part of the model is independence, which simplifies our work because it allows us to simulate each of the 10 tosses in exactly the same way.

Independence

Two random phenomena are independent if knowing the outcome of one does not change the probabilities for outcomes of the other.

Independence, like all aspects of probability, can be verified only by observing many repetitions. It is plausible that repeated tosses of a coin are independent (the coin has no memory), and observation shows that they are. It seems less plausible that successive shots by a basketball player are independent, but observation shows that they are at least very close to independent.
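The kind of observation described here can be sketched as a comparison of conditional proportions: if tosses are independent, heads should follow a head about as often as it follows a tail. The sketch below uses simulated tosses, which are independent by construction, so it illustrates the check rather than saying anything about real coins or real shooters; the function name `conditional_head_rates` is hypothetical.

```python
import random

def conditional_head_rates(n_tosses=100_000, seed=2):
    """Return (proportion of heads after a head, proportion of heads after a tail)."""
    rng = random.Random(seed)
    tosses = [rng.random() < 0.5 for _ in range(n_tosses)]  # True = head
    pairs = list(zip(tosses, tosses[1:]))                   # (previous toss, current toss)
    after_head = [cur for prev, cur in pairs if prev]
    after_tail = [cur for prev, cur in pairs if not prev]
    return sum(after_head) / len(after_head), sum(after_tail) / len(after_tail)

if __name__ == "__main__":
    p_after_head, p_after_tail = conditional_head_rates()
    print("P(head | previous head) ~", round(p_after_head, 3))
    print("P(head | previous tail) ~", round(p_after_tail, 3))
```

Both proportions should be close to 0.5. Applied to recorded data such as a player's shot sequence, a clear gap between the two proportions would be evidence against independence.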

Step 2 (assigning digits) rests on the properties of the random digit table. Here are some examples of this step.

EXAMPLE 2 Assigning digits for simulation

In the United States, the Age Discrimination in Employment Act (ADEA) forbids age discrimination against people who are age 40 and over. Terminating an “unusually” large proportion of employees age 40 and over can trigger legal action. Simulation can help determine what might be an “unusual” pattern of terminations. How might we set up such a simulation?

  (a) Choose one employee at random from a group of which 40% are age 40 and over. One digit simulates one employee:

    0, 1, 2, 3 = age 40 and over

    4, 5, 6, 7, 8, 9 = under age 40

  (b) Choose one employee at random from a group of which 43% are age 40 and over. Now two digits simulate one person:

    00, 01, 02, . . . , 42 = age 40 and over

    43, 44, 45, . . . , 99 = under age 40

    We assigned 43 of the 100 two-digit pairs to “age 40 and over” to get probability 0.43. Representing “age 40 and over” by 01, 02, . . . , 43 and “under age 40” by 44, 45, . . . , 99, 00 would also be correct.

  (c) Choose one employee at random from a group of which 30% are age 40 and over and have no plans to retire, 10% are age 40 and over and plan to retire in the next few months, and 60% are under age 40. There are now three possible outcomes, but the principle is the same. One digit simulates one person:

    0, 1, 2 = age 40 and over and have no plans to retire

    3 = age 40 and over and plan to retire in the next few months

    4, 5, 6, 7, 8, 9 = under age 40
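A quick way to check a digit assignment is to run every possible digit (or two-digit pair) through it and confirm that the share assigned to each outcome matches the intended probability. The sketch below does this for parts (a), (b), and (c); all function names and outcome labels are hypothetical, introduced only for this illustration.

```python
from collections import Counter

def assign_a(digit):
    """Part (a): one digit 0-9; 0-3 means age 40 and over."""
    return "40 and over" if digit <= 3 else "under 40"

def assign_b(pair):
    """Part (b): one two-digit pair 00-99; 00-42 means age 40 and over."""
    return "40 and over" if pair <= 42 else "under 40"

def assign_c(digit):
    """Part (c): one digit 0-9, three possible outcomes."""
    if digit <= 2:
        return "40 and over, no plans to retire"
    if digit == 3:
        return "40 and over, plans to retire"
    return "under 40"

def implied_probabilities(assign, n_values):
    """Share of the equally likely values 0..n_values-1 mapped to each outcome."""
    counts = Counter(assign(v) for v in range(n_values))
    return {outcome: count / n_values for outcome, count in counts.items()}

if __name__ == "__main__":
    print(implied_probabilities(assign_a, 10))    # 0.4 and 0.6
    print(implied_probabilities(assign_b, 100))   # 0.43 and 0.57
    print(implied_probabilities(assign_c, 10))    # 0.3, 0.1, and 0.6
```

The same check works for any assignment, including the alternative in part (b) that uses 01 through 43 for "age 40 and over".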

NOW IT’S YOUR TURN

19.1 Selecting cards at random. In a standard deck of 52 cards, there are 13 spades, 13 hearts, 13 diamonds, and 13 clubs. How would you assign digits for a simulation to determine the suit (spades, hearts, diamonds, or clubs) of a card chosen at random from a standard deck of 52 cards?