6.2 The Binomial Probability Model

Many applications of probability involve considering situations where an experiment has only two possible outcomes. A flip of a coin produces either heads or tails. A child is either male or female. A household either owns a dog or it does not. A voter selects a certain candidate or not.

These situations occur so frequently that mathematicians have studied them in great detail. We call each repetition of the experiment a trial; and we call these situations binomial experiments if four conditions are true:

Each trial has only two outcomes; to simplify discussion, we call one outcome success and the other outcome failure.
The probability of success remains the same from trial to trial. We denote the probability of success as p. (Because there are only two outcomes, this means that the probability of failure is 1 – p.)
The trials are independent; what happens on one trial does not influence what happens on any other trial.
The experiment is performed a specific number of times; we use n to denote the number of trials.

The trials in a binomial experiment are also called Bernoulli trials. Bernoulli trials are named for a famous Swiss family, but different sources attribute this particular definition to different members of the family. It is sometimes hard to tell who really did what among the Bernoullis, because family members worked very successfully in mathematics and physics for three generations, and they sometimes tried to take credit for each other’s work.

In the previous section, we looked at the sex of children in a three-child family. This is a binomial experiment, with each child constituting a “trial,” because it satisfies the required conditions.

Each trial has only two outcomes; each child is either a boy or a girl. We will call the child being a boy success, and being a girl failure (although things work out just the same if "girl" is success).
The probability of success remains the same from trial to trial; the probability of being a boy is 0.5 for each child in the family.
The trials are independent; whether the first child is a boy does not influence the sex of the later children.
The experiment is performed a specific number of times; there are three children and thus, three trials.

To further explore the characteristics of a binomial experiment, view the video StatTutor: Binomial Setting.

Now Try This 6.5

Determine whether each of the following experiments is or is not a binomial experiment. If it is, identify what constitutes success in the experiment.

(1) A student rolls a die 20 times and records whether or not a five appears on the die.

(2) A student rolls a die 20 times and records the number on the die each time.

(3) A student tosses a coin, recording how many tosses until the first head occurs.

(4) A student tosses a coin 50 times and records whether heads or tails occurs.

(5) Recording the eye color for each student in your statistics class.

(6) Recording whether or not each student in your statistics class has blue eyes.

Correct.

(1) This is a binomial experiment; success is getting a 5.
(2) This is not a binomial experiment; there are six possible outcomes of each trial.
(3) This is not a binomial experiment, because the number of trials is not fixed; the experiment continues until a head occurs.
(4) This is a binomial experiment; either getting a head or getting a tail can be considered success.
(5) This is not a binomial experiment; there are more than two possible outcomes.
(6) This is a binomial experiment; having blue eyes is considered success. (We would have to do some research to determine the value of p.)

When we perform a binomial experiment and record the number X of successes that occur, then X is a discrete random variable, and we say that X has a binomial probability model. The pattern of probability values that occurs is determined by two numbers, the number of trials we perform (we call this number n) and the probability of success (p). If X represents the number of boys in a three-child family, then n = 3 and p = 0.5. The binomial experiment of tossing a coin 3 times and recording the number of heads has the same probability model. In Section 6.1, we constructed this probability model by using the tree diagram for the sex of children in a 3-child family. We give this model again here, using X to indicate the number of either boys or heads, and P(X) to indicate the corresponding probability.

Table 6.6 The Probability Model for the Number of Boys in a Three-Child Family or the Number of Heads in Three Tosses of a Coin

X	0	1	2	3
P(X)	0.125	0.375	0.375	0.125

The probability of success (boy or head) is the same as the probability of failure (girl or tail) in this case. Both p and 1 – p are 0.5. Therefore, the probability model is the same if we count boys or girls, heads or tails.
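
The model in Table 6.6 can also be recovered by listing every possible birth order, which is what the tree diagram does. The sketch below is only an illustration: it relies on p = 0.5, so that all eight birth orders are equally likely, and the variable names are our own.

```python
from collections import Counter
from itertools import product

# Every equally likely birth order for three children: BBB, BBG, ..., GGG.
outcomes = list(product("BG", repeat=3))

# Count how many of the 8 outcomes contain 0, 1, 2, or 3 boys.
boys_count = Counter(outcome.count("B") for outcome in outcomes)

# With p = 0.5 each outcome has probability 1/8, so P(X = k) = count / 8.
model = {k: boys_count[k] / len(outcomes) for k in range(4)}

for k, prob in model.items():
    print(k, prob)
```

Counting outcomes this way would not work if p were different from 0.5, because the birth orders would no longer be equally likely.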

6.2.1 Finding Binomial Probabilities By Hand

While making a tree diagram was a reasonable tool to determine the probabilities associated with the number of boys in a three-child family, this approach becomes more cumbersome as we increase the number of trials. If we added two more children to our theoretical family, we would have to add many more branches to our tree. Surely, there is a better way.

The “better way” involves a fairly ugly-looking mathematical formula. It may not seem better at first glance, but the advantage is that statistical software can easily do the necessary calculations.

If we have a binomial experiment having n trials, with probability of success p, and we let X represent the number of successes in these n trials, then

$P(X = k) = \binom{n}{k} p^{k} (1-p)^{n-k}$

where k is a whole number between 0 and n inclusive.

Let’s deconstruct the formula to make it a bit more understandable:

$\binom{n}{k}$ is called the binomial coefficient and is shorthand that tells us how many different ways the k successes can be arranged in n trials. (If you have one boy in a three-child family, there are 3 different ways for this to occur. The boy can be the oldest child, the middle child, or the youngest child.)
$\binom{n}{k} = \frac{n!}{k!(n - k)!}$ , where the exclamation point indicates a factorial. n! is the product of all the whole numbers starting at 1 and ending at n. When n=0, n! is defined to be 1.
$p^{k} (1-p)^{n-k}$ is the probability of k successes and n - k failures occurring in some particular way. (For example, the first child is a boy, the second child is a girl, the third child is a girl.)

If we have k successes, $p^{k} (1-p)^{n-k}$ is the same number regardless of the order in which the successes occur. So our formula finds the probability for one such way $(p^{k} (1-p)^{n-k})$ and then multiplies that result by the total number of ways those successes can occur $\binom{n}{k}$ .
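
To see that the binomial coefficient really does count arrangements, the following sketch lists the positions the single boy could occupy in a three-child family and compares that count with the factorial formula. It assumes Python 3.8 or later (for `math.comb`); the variable names are illustrative only.

```python
from itertools import combinations
from math import comb, factorial

n, k = 3, 1  # one boy among three children

# Each arrangement is a choice of which k of the n birth positions hold a boy.
arrangements = list(combinations(range(1, n + 1), k))
print(arrangements)  # [(1,), (2,), (3,)]

# Three equivalent counts: enumeration, the factorial formula, and math.comb.
by_formula = factorial(n) // (factorial(k) * factorial(n - k))
print(len(arrangements), by_formula, comb(n, k))  # 3 3 3
```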

Now we can try this out on our example of boys in a three-child family. To find the probability of 2 boys in a three-child family, we use

$P(X = 2) = \binom{3}{2} 0.5^{2} (1-0.5)^{3-2}$ .

And so,

$P(X = 2) = \frac{3!}{2!(3-2)!} 0.5^{2} (1-0.5)^{3-2}$

$= \frac{3 \times 2 \times 1}{(2 \times 1)(1)} \times 0.25 \times 0.5$

$= 3 \times 0.125 = 0.375$

Happily, this is the same value that we obtained from the tree diagram and showed in the probability model above.
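
The same calculation can be written as a short function. This is a minimal sketch, assuming Python 3.8 or later; the name `binomial_pmf` is our own choice and is not part of any particular statistics package.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Return P(X = k) for n independent trials with success probability p."""
    if not 0 <= k <= n:
        raise ValueError("k must be a whole number between 0 and n inclusive")
    # Number of arrangements times the probability of any one arrangement.
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Two boys in a three-child family.
print(binomial_pmf(2, 3, 0.5))  # 0.375
```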

Now Try This 6.6

A tetrahedral die has four triangular faces, labeled A, B, C, and D. All faces are equally likely when the die is rolled. A student rolls the die 4 times. If X represents the number of times B appears, find P(X = 2). Round your answer to two decimal places.

Correct. Since all four faces are equally likely, the probability of any one of them occurring is 0.25. Specifically then, the probability of success (a B) is p=0.25.

$P(X = 2) = \frac{4!}{2!(4-2)!} 0.25^{2} (1 - 0.25)^{4-2}$

$= \frac{4 \times 3 \times 2 \times 1}{(2 \times 1)(2 \times 1)} \times 0.0625 \times 0.5625$

= 0.2109375 (saving all the decimal places) or 0.21 rounded to two decimal places.

6.2.2 Generating a Binomial Probability Model

Finding a single binomial probability using the formula may be manageable and even somewhat novel. But none of us is likely to enjoy repeating this procedure over and over to create a table of probabilities for a binomial experiment with even a reasonably small number of trials. Before the wide availability of computer and calculator software that easily finds these numbers, reference works gave tables of binomial probabilities for various values of n and p. These tables can still be found on the web; if you search for “table of binomial probabilities,” you are likely to find many.

The table below shows a portion of such a table for a binomial experiment with 4 trials. The lefthand column shows the possible number of successes (0, 1, 2, 3, or 4). Each of the remaining columns shows the corresponding probabilities for a different value of p.

Table 6.7 A Binomial Probability Table for an Experiment with 4 Trials

	Probability of Success, p
Number of Successes, k	.10	.15	.20	.25	.30	.35	.40	.45	.50
0	.6561	.5220	.4096	.3164	.2401	.1785	.1296	.0915	.0625
1	.2916	.3685	.4096	.4219	.4116	.3845	.3456	.2995	.2500
2	.0486	.0975	.1536	.2109	.2646	.3105	.3456	.3675	.3750
3	.0036	.0115	.0256	.0469	.0756	.1115	.1536	.2005	.2500
4	.0001	.0005	.0016	.0039	.0081	.0150	.0256	.0410	.0625

If we pull out the column representing the tetrahedral die example above, where p = 0.25, we obtain the binomial probability model below. Notice that the value for P(X = 2) is the same (rounded to 4 decimal places) as the value we calculated using the formula.

Table 6.8 The Binomial Probability Model for Four Rolls of a Tetrahedral Die

X	0	1	2	3	4
P(X)	0.3164	0.4219	0.2109	0.0469	0.0039

Even using these tables has its drawbacks, however. These tables typically use a limited number of trials (n), and they generally show only certain values for p. If p were, for example, 0.22, and n were 8, how would we proceed?

Fortunately, most statistical software, either on a computer or a calculator, can easily compute binomial probability values. The process for doing this varies from software to software. The table below was generated using CrunchIt!, finding each (3 decimal place) probability separately by entering n and p, and then changing the value of X for each number of successes (0 through 6).

Table 6.9 A Binomial Probability Model Obtained Using Statistical Software

X	0	1	2	3	4	5	6
P(X)	0.063	0.220	0.323	0.253	0.112	0.026	0.003

Some calculators can generate a whole table at once. You should find the technology that works best for you, and learn to use it well.
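
As one illustration of what such technology does, the sketch below builds a whole probability model for any n and p, including the n = 8, p = 0.22 case that printed tables are unlikely to cover. It is not how CrunchIt! or any particular calculator works internally; the function name `binomial_model` is hypothetical, and Python 3.8 or later is assumed.

```python
from math import comb


def binomial_model(n: int, p: float) -> list:
    """Return [P(X = 0), P(X = 1), ..., P(X = n)] for a binomial experiment."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


# A case not covered by Table 6.7: n = 8 trials with p = 0.22.
for k, prob in enumerate(binomial_model(8, 0.22)):
    print(f"{k}\t{prob:.3f}")
```

Calling `binomial_model(4, 0.25)` gives values that agree with Table 6.8 up to rounding. The values in Table 6.9 appear to be consistent with n = 6 and p = 0.37, although the text does not state which n and p were used.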

Now Try This 6.7

According to the most recent census data reported by Canada’s national statistics agency, Statistics Canada, 23% of all Canadians reported that French is their primary language. Suppose that 5 Canadians were selected at random, with X representing the number of them who report that French is their primary language. Use statistical software to complete the table below for the probability model for this binomial experiment. Round values to 3 decimal places.

X	0	1	2	3	4	5
P(X)

Correct.

X	0	1	2	3	4	5
P(X)	0.271	0.404	0.242	0.072	0.011	0.001

6.2.3 Finding Specific Binomial Probabilities

We now return to a question we posed at the beginning of Chapter 6: what is the probability that you get at least 7 out of 10 correct on a multiple-choice pop quiz if you guess at each answer? Many multiple-choice questions have four answers. If you guess at an answer, your probability of success, p, is ¼ or 0.25. Each question constitutes a trial, so n = 10. This situation is a binomial experiment, with n = 10 and p = 0.25 as shown in the accompanying table.

Table 6.10 The Binomial Probability Model for n=10 and p=0.25

X	0	1	2	3	4	5	6	7	8	9	10
P(X)	0.05631	0.18771	0.28157	0.25028	0.14600	0.05840	0.01622	0.00309	0.00039	0.00003	0.00000

We can see from this table that the probability of exactly 7 successes (answers correct) is 0.00309.

(We have used 5 decimal places here to limit the number of “essentially zero” probabilities that appear. In this case, “.00000” gives the probability of 10 successes as a value rounded to 5 decimal places. It definitely does not mean that the probability is exactly 0 nor that 10 successes cannot occur.)

But what about the probability of getting at least 7 correct? “At least 7” correct means 7 or more, that is, 7 or 8 or 9 or 10 correct. Because these numbers of successes represent disjoint events, the probability of the number of successes being 7 or 8 or 9 or 10 is the sum of their individual probabilities. That is, P(X ≥ 7) = 0.00309 + 0.00039 + 0.00003 + 0.00000 = 0.00351. This probability suggests that guessing is very unlikely to produce a passing score on the quiz, assuming that 70% is required to pass.

What is the probability that a student gets at most 5 correct on the quiz by guessing? “At most 5” means 5 or fewer correct, that is, 0, 1, 2, 3, 4, or 5. Once again, the desired probability is the sum of the individual probabilities. So P(X ≤ 5) = 0.05631 + 0.18771 + 0.28157 + 0.25028 + 0.14600 + 0.05840 = 0.98027.
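
The “at least” and “at most” sums above can be reproduced directly from the binomial formula rather than from the rounded table entries. The following is a minimal sketch, assuming Python 3.8 or later; `binomial_pmf` is an illustrative helper name, not a library function.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for a binomial experiment with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 10, 0.25  # ten four-option questions, guessing on each

# "At least 7" adds the disjoint events X = 7, 8, 9, 10.
p_at_least_7 = sum(binomial_pmf(k, n, p) for k in range(7, n + 1))

# "At most 5" adds the disjoint events X = 0, 1, ..., 5.
p_at_most_5 = sum(binomial_pmf(k, n, p) for k in range(0, 6))

print(f"P(X >= 7) = {p_at_least_7:.5f}")
print(f"P(X <= 5) = {p_at_most_5:.5f}")
```

Because the code adds unrounded probabilities, its results can differ from sums of table entries in the last decimal place; here they agree with the values in the text to within rounding.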

For more guidance on finding binomial probabilities, watch the video StatTutor: Binomial Probabilities.

Now Try This 6.8

Use the probability model for the Canadian French primary language experiment above to determine each probability. Round each answer to three decimal places.

(1) The probability that exactly 2 of the 5 randomly selected Canadians claim French as their primary language.

(2) The probability that at least 2 of the 5 randomly selected Canadians claim French as their primary language.

(3) The probability that at most 2 of the 5 randomly selected Canadians claim French as their primary language.

Correct.

(1) P(X=2) = 0.242
(2) P(X≥2) = 0.242 + 0.072 + 0.011 + 0.001=0.326
(3) P(X≤2) = 0.271 + 0.404 + 0.242 = 0.917

6.2.4 Binomial Mean and Standard Deviation

Binomial random variables are a particular type of discrete random variable. A binomial random variable, X, is the number of successes in the n trials of the binomial experiment. The formulas for mean and standard deviation presented in Section 6.1 apply here as well.

The mean or expected value of any discrete random variable X is

$\mu_{x} = \sum x \cdot P(x)$ ,

and its standard deviation is

$\sigma _{x} = \sqrt{\sum (x - \mu_x)^2 P(x)}$ .

Let’s look for the last time at the three-child family experiment, where X represents the number of boys in the family. The probability model for this experiment is shown below.

Table 6.11 The Probability Model for the Number of Boys in a Three-Child Family One Last Time

X	0	1	2	3
P(X)	0.125	0.375	0.375	0.125

To find the mean or expected value of X, we calculate

$\mu_x = 0 \times 0.125 + 1 \times 0.375 + 2 \times 0.375 + 3 \times 0.125 = 0.375 + 0.750 + 0.375 = 1.500$ .

Perhaps you notice something interesting about this value. For this experiment, n is 3 and p is 0.5, and the product of those numbers is 1.5. Is this a happy coincidence? As it turns out, it is not. The mean (expected value) of the binomial probability model is always $\mu_x = n \times p$ , where n is the number of trials and p is the probability of success. This greatly simplifies our calculation, particularly when n is large (and we have many possible values of X) or the values of P(X) are reported to many decimal places.

Recall that the mean represents what the average number of boys per family should be if we randomly select many, many three-child families and record how many boys each contains.
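
This long-run interpretation can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the number of simulated families (100,000) and the random seed are arbitrary choices of ours, and each child is treated as an independent trial with p = 0.5.

```python
import random

rng = random.Random(2024)  # fixed, arbitrary seed so the run is repeatable
n, p = 3, 0.5
families = 100_000         # hypothetical number of simulated families

# Each child is a boy with probability p, independently of the others.
boys_per_family = [
    sum(rng.random() < p for _ in range(n)) for _ in range(families)
]

average_boys = sum(boys_per_family) / families
print(average_boys)  # should land close to n * p = 1.5
```

The simulated average will not equal 1.5 exactly, but it should get closer to 1.5 as the number of simulated families grows.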

To calculate the standard deviation for a binomial probability model when n = 3 and p = 0.5, we use

$\sigma_{x} = \sqrt{(0 - 1.5)^2 (0.125) + (1- 1.5)^2 (0.375) + (2-1.5)^2 (0.375) + (3 - 1.5)^2 (0.125)}$

$= \sqrt{(2.25) (0.125) + (0.25) (0.375) + (0.25) (0.375) + (2.25) (0.125)} = \sqrt{0.750} = 0.866$

rounded to three decimal places.

While it is probably not obvious from the above calculation, the value under the square root sign is actually $n \times p \times (1-p)$ . So this makes the formula for the standard deviation of the binomial random variable X look much simpler:

$\sigma_x = \sqrt{n \times p \times (1-p)}$ .
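
As a check that the shortcut formulas agree with the general definitions, the sketch below computes the mean and standard deviation both ways for n = 3 and p = 0.5. It assumes Python 3.8 or later, and the variable names are illustrative.

```python
from math import comb, sqrt

n, p = 3, 0.5
model = {k: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}

# General definitions for any discrete random variable.
mean_by_definition = sum(k * prob for k, prob in model.items())
sd_by_definition = sqrt(
    sum((k - mean_by_definition) ** 2 * prob for k, prob in model.items())
)

# Binomial shortcuts.
mean_shortcut = n * p
sd_shortcut = sqrt(n * p * (1 - p))

print(mean_by_definition, mean_shortcut)                   # 1.5 1.5
print(round(sd_by_definition, 3), round(sd_shortcut, 3))   # 0.866 0.866
```

Changing `n` and `p` at the top lets you confirm the agreement for other binomial experiments, up to floating-point rounding.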

For more about the mean and standard deviation of a binomial distribution, watch the video StatTutor: Binomial Mean and Standard Deviation.

As we saw in the last section, if the binomial probability model is bell-shaped (as this one is), then the Empirical Rule applies. So about 68% of the time, we expect the number of boys in a three-child family to lie within one standard deviation of the mean, that is, between 0.634 and 2.366. Effectively, about 68% of the time, there will be one or two boys in a three-child family.
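
The Empirical Rule gives only an approximation, so it can be instructive to compare it with the exact probability. The sketch below (assuming Python 3.8 or later, with our own variable names) adds the binomial probabilities for the whole numbers that fall within one standard deviation of the mean.

```python
from math import comb, sqrt

n, p = 3, 0.5
mean = n * p
sd = sqrt(n * p * (1 - p))
low, high = mean - sd, mean + sd

# Add P(X = k) for every whole number k inside the one-standard-deviation interval.
within_one_sd = sum(
    comb(n, k) * p**k * (1 - p) ** (n - k)
    for k in range(n + 1)
    if low <= k <= high
)
print(round(low, 3), round(high, 3), within_one_sd)  # 0.634 2.366 0.75
```

For this small experiment the exact probability of one or two boys is 0.75, somewhat above the rule's "about 68%", which is a reminder that the rule is a rough guide for a model with so few possible values.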

Now Try This 6.9

Based on a 2008 survey, Parade magazine reported that 24% of married people had kept an important secret from their spouse. Suppose that we take a random sample of 10 married persons, and record X, the number who have kept an important secret from their spouse.

(1) Find the mean and standard deviation of X.

(2) Can you use these values to describe the variability in the values of X? (Note that this probability model is not bell-shaped.)

Correct.

(1) $\mu_x = 10(0.24) = 2.4$ and $\sigma_x = \sqrt{10 \times 0.24 \times 0.76} = 1.351$

(2) The mean tells us that if we take many, many random samples of 10 married people, on average, 2.4 people per sample will have kept an important secret from their spouse. Because the model is not bell-shaped (it is right-skewed), we cannot use the Empirical Rule to describe the variability in the values of X.

Because we are so often interested in counting the number of successes that occur in a series of trials, we are frequently considering binomial experiments. Agreeing or disagreeing with a survey question, having an IQ higher than 120 or not, selecting an M&M’s candy that is red (or a different specific color) are all situations that might lead to a binomial model. Before you proceed to use binomial probability techniques, however, it is important for you to verify that your situation satisfies all four conditions for a binomial experiment.

Not all discrete probability models are binomial ones. And not all probability models are discrete. In the next chapter, we will consider continuous probability models in general, and a particular model that is probably the most famous (or infamous, depending on your perspective) probability model of all.
