probability model
The idea of probability as a proportion of outcomes in very many repeated trials guides our intuition but is hard to express in mathematical form. A description of a random phenomenon in the language of mathematics is called a probability model. To see how to proceed, think first about a very simple random phenomenon, tossing a coin once. When we toss a coin, we cannot know the outcome in advance. What do we know? We are willing to say that the outcome will be either heads or tails. Because the coin appears to be balanced, we believe that each of these outcomes has probability 1/2. This description of coin tossing has two parts: a list of possible outcomes and a probability for each outcome.
This two-part description is the starting point for a probability model. We begin by describing the outcomes of a random phenomenon and then learn how to assign these probabilities ourselves.
Sample spaces
A probability model first tells us what outcomes are possible.
Sample Space
The sample space $S$ of a random phenomenon is the set of all distinct possible outcomes.
The name “sample space” is natural in random sampling, where each possible outcome is a sample and the sample space contains all possible samples. To specify $S$, we must state what constitutes an individual outcome and then state which outcomes can occur. We often have some freedom in defining the sample space, so the choice of $S$ is a matter of convenience as well as correctness. The idea of a sample space, and the freedom we may have in specifying it, are best illustrated by examples.
EXAMPLE 4.3 Sample Space for Tossing a Coin
Toss a coin. There are only two possible outcomes, and the sample space is
$$S = \{\text{heads}, \text{tails}\}$$
or, more briefly, $S = \{H, T\}$.
EXAMPLE 4.4 Sample Space for Random Digits
Type “=RANDBETWEEN(0,9)” into any Excel cell and hit enter. Record the value of the digit that appears in the cell. The possible outcomes are
$$S = \{0, 1, 2, 3, 4, 5, 6, 7, 8, 9\}$$
EXAMPLE 4.5 Sample Space for Tossing a Coin Four Times
Toss a coin four times and record the results. That's a bit vague. To be exact, record the results of each of the four tosses in order. A possible outcome is then HTTH. Counting shows that there are 16 possible outcomes. The sample space is the set of all 16 strings of four toss results—that is, strings of H's and T's.
Suppose that our only interest is the number of heads in four tosses. Now we can be exact in a simpler fashion. The random phenomenon is to toss a coin four times and count the number of heads. The sample space contains only five outcomes:
$$S = \{0, 1, 2, 3, 4\}$$
This example illustrates the importance of carefully specifying what constitutes an individual outcome.
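The two sample spaces in Example 4.5 can be enumerated directly. The following is a minimal Python sketch; the variable names are illustrative and not part of the text. It lists the 16 ordered outcomes and then collapses them to the five possible head counts.

```python
from itertools import product

# All ordered results of four tosses, written as strings such as "HTTH"
ordered_outcomes = ["".join(tosses) for tosses in product("HT", repeat=4)]

# Coarser sample space: record only the number of heads in each outcome
head_counts = sorted({outcome.count("H") for outcome in ordered_outcomes})

print(len(ordered_outcomes))  # 16
print(head_counts)            # [0, 1, 2, 3, 4]
```

Both lists describe the same physical experiment; they differ only in what is recorded as an individual outcome.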
Although these examples seem remote from the practice of statistics, the connection is surprisingly close. Suppose that in conducting a marketing survey, you select four people at random from a large population and ask each if he or she has used a given product. The answers are Yes or No. The possible outcomes—the sample space—are exactly as in Example 4.5 if we replace heads by Yes and tails by No. Similarly, the possible outcomes of an SRS of 1500 people are the same in principle as the possible outcomes of tossing a coin 1500 times. One of the great advantages of mathematics is that the essential features of quite different phenomena can be described by the same mathematical model, which, in our case, is the probability model.
The sample spaces considered so far correspond to situations in which there is a finite list of all the possible values. There are other sample spaces in which, theoretically, the list of outcomes is infinite.
EXAMPLE 4.6 Using Software
Most statistical software has a function that will generate a random number between 0 and 1. The sample space is
$$S = \{\text{all numbers between 0 and 1}\}$$
This is a mathematical idealization with an infinite number of outcomes. In reality, any specific random number generator produces numbers with some limited number of decimal places so that, strictly speaking, not all numbers between 0 and 1 are possible outcomes. For example, in default mode, Excel reports random numbers like 0.798249, with six decimal places. The entire interval from 0 to 1 is easier to think about. It also has the advantage of being a suitable sample space for different software systems that produce random numbers with different numbers of digits.
Apply Your Knowledge
4.14 Describing sample spaces.
In each of the following situations, describe a sample space for the random phenomenon. In some cases, you have some freedom in your choice of $S$.
4.15 Describing sample spaces.
In each of the following situations, describe a sample space for the random phenomenon. Explain why, theoretically, a list of all possible outcomes is not finite.
A sample space lists the possible outcomes of a random phenomenon. To complete a mathematical description of the random phenomenon, we must also give the probabilities with which these outcomes occur.
The true long-term proportion of any outcome—say, “exactly two heads in four tosses of a coin”— can be found only empirically, and then only approximately. How then can we describe probability mathematically? Rather than immediately attempting to give “correct” probabilities, let's confront the easier task of laying down rules that any assignment of probabilities must satisfy. We need to assign probabilities not only to single outcomes but also to sets of outcomes.
Event
An event is an outcome or a set of outcomes of a random phenomenon. That is, an event is a subset of the sample space.
EXAMPLE 4.7 Exactly Two Heads in Four Tosses
Take the sample space for four tosses of a coin to be the 16 possible outcomes in the form HTHH. Then “exactly two heads” is an event. Call this event $A$. The event $A$ expressed as a set of outcomes is
$$A = \{HHTT, HTHT, HTTH, THHT, THTH, TTHH\}$$
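The outcomes that make up the event “exactly two heads” can be found by filtering the 16-outcome sample space. This is a small Python sketch; the names `sample_space` and `exactly_two_heads` are illustrative.

```python
from itertools import product

# The 16 ordered outcomes of four coin tosses
sample_space = ["".join(tosses) for tosses in product("HT", repeat=4)]

# An event is a subset of the sample space: keep outcomes with exactly two H's
exactly_two_heads = [outcome for outcome in sample_space if outcome.count("H") == 2]

print(exactly_two_heads)
# ['HHTT', 'HTHT', 'HTTH', 'THHT', 'THTH', 'TTHH']
```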
In a probability model, events have probabilities. What properties must any assignment of probabilities to events have? Here are some basic facts about any probability model. These facts follow from the idea of probability as “the long-run proportion of repetitions on which an event occurs.”
Probability rules
Formal probability uses mathematical notation to state Facts 1 to 4 more concisely. We use capital letters near the beginning of the alphabet to denote events. If $A$ is any event, we write its probability as $P(A)$. Here are our probability facts in formal language. As you apply these rules, remember that they are just another form of intuitively true facts about long-run proportions.
Probability Rules
Rule 1. The probability $P(A)$ of any event $A$ satisfies $0 \le P(A) \le 1$.
Rule 2. If $S$ is the sample space in a probability model, then $P(S) = 1$.
Rule 3. Two events $A$ and $B$ are disjoint if they have no outcomes in common and so can never occur together. If $A$ and $B$ are disjoint,
$$P(A \text{ or } B) = P(A) + P(B)$$
This is the addition rule for disjoint events.
Rule 4. The complement of any event $A$ is the event that $A$ does not occur, written as $A^c$. The complement rule states that
$$P(A^c) = 1 - P(A)$$
Venn diagram
You may find it helpful to draw a picture to remind yourself of the meaning of complements and disjoint events. A picture like Figure 4.2 that shows the sample space $S$ as a rectangular area and events as areas within $S$ is called a Venn diagram. The events $A$ and $B$ in Figure 4.2 are disjoint because they do not overlap. As Figure 4.3 shows, the complement $A^c$ contains exactly the outcomes that are not in $A$.
EXAMPLE 4.8 Favorite Vehicle Colors
What is your favorite color for a vehicle? Our preferences can be related to our personality, our moods, or particular objects. Here is a probability model for color preferences.2
Color | White | Black | Silver | Gray |
Probability | 0.24 | 0.19 | 0.16 | 0.15 |
Color | Red | Blue | Brown | Other |
Probability | 0.10 | 0.07 | 0.05 | 0.04 |
Each probability is between 0 and 1. The probabilities add to 1 because these outcomes together make up the sample space $S$. Our probability model corresponds to selecting a person at random and asking him or her about a favorite color.
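The two checks just described (every probability lies between 0 and 1, and the probabilities sum to 1) are easy to automate. A minimal Python sketch follows, using the probabilities from the table in Example 4.8; the dictionary name `color_probs` is an illustrative choice.

```python
import math

# Probability model from Example 4.8
color_probs = {
    "White": 0.24, "Black": 0.19, "Silver": 0.16, "Gray": 0.15,
    "Red": 0.10, "Blue": 0.07, "Brown": 0.05, "Other": 0.04,
}

# Rule 1: each probability lies between 0 and 1
each_in_range = all(0 <= p <= 1 for p in color_probs.values())

# Rule 2: the whole sample space has probability 1
# (isclose guards against floating-point rounding in the sum)
sums_to_one = math.isclose(sum(color_probs.values()), 1.0)

print(each_in_range, sums_to_one)  # True True
```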
Let's use the probability Rules 3 and 4 to find some probabilities for favorite vehicle colors.
EXAMPLE 4.9 Black or Silver?
What is the probability that a person's favorite vehicle color is black or silver? If the favorite is black, it cannot be silver, so these two events are disjoint. Using Rule 3, we find
$$P(\text{black or silver}) = P(\text{black}) + P(\text{silver}) = 0.19 + 0.16 = 0.35$$
There is a 35% chance that a randomly selected person will choose black or silver as his or her favorite color. Suppose that we want to find the probability that the favorite color is not blue.
EXAMPLE 4.10 Use the Complement Rule
To solve this problem, we could use Rule 3 and add the probabilities for white, black, silver, gray, red, brown, and other. However, it is easier to use the probability that we have for blue and Rule 4. The event that the favorite is not blue is the complement of the event that the favorite is blue. Using our notation for events, we have
$$P(\text{not blue}) = 1 - P(\text{blue}) = 1 - 0.07 = 0.93$$
We see that 93% of people have a favorite vehicle color that is not blue.
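The calculations in Examples 4.9 and 4.10 can be reproduced in a few lines. This Python sketch uses the Example 4.8 probabilities (the variable names are illustrative) and also confirms that the complement rule agrees with adding the seven non-blue probabilities.

```python
# Probability model from Example 4.8
color_probs = {
    "White": 0.24, "Black": 0.19, "Silver": 0.16, "Gray": 0.15,
    "Red": 0.10, "Blue": 0.07, "Brown": 0.05, "Other": 0.04,
}

# Rule 3 (addition rule): black and silver are disjoint events
p_black_or_silver = color_probs["Black"] + color_probs["Silver"]

# Rule 4 (complement rule): "not blue" is the complement of "blue"
p_not_blue = 1 - color_probs["Blue"]

# The longer route: add the probabilities of every color except blue
p_not_blue_by_addition = sum(p for color, p in color_probs.items() if color != "Blue")

print(round(p_black_or_silver, 2))       # 0.35
print(round(p_not_blue, 2))              # 0.93
print(round(p_not_blue_by_addition, 2))  # 0.93
```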
Apply Your Knowledge
4.16 Red or brown.
Refer to Example 4.8, and find the probability that the favorite color is red or brown.
4.17 White, black, silver, gray, or red.
Refer to Example 4.8, and find the probability that the favorite color is white, black, silver, gray, or red using Rule 4. Explain why this calculation is easier than finding the answer using Rule 3.
4.18 Moving up.
An economist studying economic class mobility finds that the probability that the son of a father in the lowest economic class remains in that class is 0.46. What is the probability that the son moves to one of the higher classes?
4.19 Occupational deaths.
Government data on job-related deaths assign a single occupation for each such death that occurs in the United States. The data on occupational deaths in 2012 show that the probability is 0.183 that a randomly chosen death was that of a construction worker and 0.039 that it was that of a miner. What is the probability that a randomly chosen death was either construction related or mining related? What is the probability that the death was related to some other occupation?
4.20 Grading Canadian health care.
Annually, the Canadian Medical Association uses the marketing research firm Ipsos Canada to measure public opinion with respect to the Canadian health care system. Between July 17 and July 26 of 2013, Ipsos Canada interviewed a random sample of 1000 adults.3 The people in the sample were asked to grade the overall quality of health care services as an A, B, C, or F, where an A is the highest grade and an F is a failing grade. Here are the results:
Outcome | Probability |
---|---|
A | 0.30 |
B | 0.45 |
C | ? |
F | 0.06 |
These proportions are probabilities for choosing an adult at random and asking the person's opinion on the Canadian health care system.
Assigning probabilities: Finite number of outcomes
The individual outcomes of a random phenomenon are always disjoint. So, the addition rule provides a way to assign probabilities to events with more than one outcome: start with probabilities for individual outcomes and add to get probabilities for events. This idea works well when there are only a finite (fixed and limited) number of outcomes.
Probabilities in a Finite Sample Space
Assign a probability to each individual outcome. These probabilities must be numbers between 0 and 1 and must have sum 1.
The probability of any event is the sum of the probabilities of the outcomes making up the event.
CASE 4.1 Uncovering Fraud by Digital Analysis
What is the probability that the leftmost digit (“first digit”) of a multidigit financial number is 9? Many of us would assume the probability to be 1/9. Surprisingly, this is often not the case for legitimately reported financial numbers. It is a striking fact that the first digits of numbers in legitimate records often follow a distribution known as Benford's law. Here it is (note that the first digit can't be 0):
First digit | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
Proportion | 0.301 | 0.176 | 0.125 | 0.097 | 0.079 | 0.067 | 0.058 | 0.051 | 0.046 |
It is a regrettable fact that financial fraud permeates business and governmental sectors. In a recent 2014 study, the Association of Certified Fraud Examiners (ACFE) estimates that a typical organization loses 5% of revenues each year to fraud.4 ACFE projects a global fraud loss of nearly $4 trillion. Common examples of business fraud include:
In all these situations, the individuals committing fraud need to “invent” fake financial entries. However the invented numbers are created, the first digits of the fictitious numbers will most likely not follow the probabilities given by Benford's law. As such, Benford's law serves as an important “digital analysis” tool for auditors, typically certified public accountants (CPAs), trained to look for fraudulent behavior.
Of course, not all sets of data follow Benford's law. Numbers that are assigned, such as Social Security numbers, do not. Nor do data with a fixed maximum, such as deductible contributions to individual retirement accounts (IRAs). Nor, of course, do random numbers. But given that a remarkable number of financial data sets do closely obey Benford's law, its role in the auditing of financial and accounting statements cannot be ignored.
EXAMPLE 4.11 Find Some Probabilities for Benford's Law
CASE 4.1 Consider the events
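From the table of probabilities in Case 4.1, the event $B$ that a first digit is 3 or less has probability
$$P(B) = P(1) + P(2) + P(3) = 0.301 + 0.176 + 0.125 = 0.602$$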
Note that $P(B)$ is not the same as the probability that a first digit is strictly less than 3. The probability that a first digit is 3 is included in “3 or less” but not in “less than 3.”
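The rule for finite sample spaces (add the probabilities of the outcomes that make up the event) translates directly into code. Here is a minimal Python sketch using the Benford's law table from Case 4.1; the helper name `event_probability` is hypothetical.

```python
# Benford's law probabilities for the first digit (Case 4.1)
benford = {1: 0.301, 2: 0.176, 3: 0.125, 4: 0.097, 5: 0.079,
           6: 0.067, 7: 0.058, 8: 0.051, 9: 0.046}

def event_probability(model, event):
    """Sum the probabilities of the individual outcomes in the event."""
    return sum(model[outcome] for outcome in event)

# "3 or less" includes the digit 3; "less than 3" does not
p_three_or_less = event_probability(benford, {1, 2, 3})
p_less_than_three = event_probability(benford, {1, 2})

print(round(p_three_or_less, 3))    # 0.602
print(round(p_less_than_three, 3))  # 0.477
```

The two results differ by 0.125, the probability that the first digit is exactly 3.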
Apply Your Knowledge
4.21 Household space heating.
Draw a U.S. household at random, and record the primary source of energy to generate heat for warmth of the household using space-heating equipment. “At random” means that we give every household the same chance to be chosen. That is, we choose an SRS of size 1. Here is the distribution of primary sources for U.S. households:5
Primary source | Probability |
---|---|
Natural gas | 0.50 |
Electricity | 0.35 |
Distillate fuel oil | 0.06 |
Liquefied petroleum gases | 0.05 |
Wood | 0.02 |
Other | 0.02 |
4.22 Benford's law.
CASE 4.1 Using the probabilities for Benford's law, find the probability that a first digit is anything other than 4.
4.23 Use the addition rule.
CASE 4.1 Use the addition rule (page 182) with the probabilities for the events $A$ and $B$ from Example 4.11 to find the probability of $A$ or $B$.
EXAMPLE 4.12 Find More Probabilities for Benford's Law
CASE 4.1 Check that the probability of the event $C$ that a first digit is even is
$$P(C) = P(2) + P(4) + P(6) + P(8) = 0.176 + 0.097 + 0.067 + 0.051 = 0.391$$
Consider again event $B$ from Example 4.11 (page 185), which had an associated probability of 0.602. The probability
$$P(B \text{ or } C) = P(1) + P(2) + P(3) + P(4) + P(6) + P(8) = 0.301 + 0.176 + 0.125 + 0.097 + 0.067 + 0.051 = 0.817$$
is not the sum of $P(B)$ and $P(C)$ because events $B$ and $C$ are not disjoint. The outcome of 2 is common to both events. Be careful to apply the addition rule only to disjoint events. In Section 4.3, we expand upon the addition rule given in this section to handle the case of nondisjoint events.
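A short calculation shows what goes wrong when the addition rule is applied to events that overlap. This Python sketch uses the Benford's law table from Case 4.1; the set names and the helper `prob` are illustrative.

```python
# Benford's law probabilities for the first digit (Case 4.1)
benford = {1: 0.301, 2: 0.176, 3: 0.125, 4: 0.097, 5: 0.079,
           6: 0.067, 7: 0.058, 8: 0.051, 9: 0.046}

def prob(event):
    """Probability of an event: sum over the outcomes it contains."""
    return sum(benford[digit] for digit in event)

three_or_less = {1, 2, 3}
even = {2, 4, 6, 8}

overlap = three_or_less & even                 # outcomes common to both events
p_either = prob(three_or_less | even)          # correct: each outcome counted once
naive_sum = prob(three_or_less) + prob(even)   # wrong here: digit 2 counted twice

print(overlap)              # {2}
print(round(p_either, 3))   # 0.817
print(round(naive_sum, 3))  # 0.993
```

The naive sum overshoots by exactly 0.176, the probability of the shared outcome 2.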
Assigning probabilities: Equally likely outcomes
Assigning correct probabilities to individual outcomes often requires long observation of the random phenomenon. In some circumstances, however, we are willing to assume that individual outcomes are equally likely because of some balance in the phenomenon. Ordinary coins have a physical balance that should make heads and tails equally likely, for example, and the table of random digits comes from a deliberate randomization.
EXAMPLE 4.13 First Digits That Are Equally Likely
You might think that first digits in business records are distributed “at random” among the digits 1 to 9. The nine possible outcomes would then be equally likely. The sample space for a single digit is
$$S = \{1, 2, 3, 4, 5, 6, 7, 8, 9\}$$
Because the total probability must be 1, the probability of each of the nine outcomes must be 1/9. That is, the assignment of probabilities to outcomes is
First digit | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
Probability | 1/9 | 1/9 | 1/9 | 1/9 | 1/9 | 1/9 | 1/9 | 1/9 | 1/9 |
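The probability of the event $B$ that a randomly chosen first digit is 3 or less is
$$P(B) = P(1) + P(2) + P(3) = \frac{1}{9} + \frac{1}{9} + \frac{1}{9} = \frac{3}{9} \approx 0.333$$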
Compare this with the Benford's law probability in Example 4.11 (page 185). A crook who fakes data by using “random” digits will end up with too few first digits that are 3 or less.
In Example 4.13, all outcomes have the same probability. Because there are nine equally likely outcomes, each must have probability 1/9. Because exactly three of the nine equally likely outcomes are 3 or less, the probability of this event is 3/9. In the special situation in which all outcomes are equally likely, we have a simple rule for assigning probabilities to events.
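Equally Likely Outcomes
If a random phenomenon has $k$ possible outcomes, all equally likely, then each individual outcome has probability $1/k$. The probability of any event $A$ is
$$P(A) = \frac{\text{count of outcomes in } A}{\text{count of outcomes in } S} = \frac{\text{count of outcomes in } A}{k}$$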
Most random phenomena do not have equally likely outcomes, so the general rule for finite sample spaces (page 184) is more important than the special rule for equally likely outcomes.
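The counting rule for equally likely outcomes can be sketched in Python with exact fractions. The function name `equally_likely_probability` is hypothetical, and the Benford figure of 0.602 is the value quoted in the text for “3 or less.”

```python
from fractions import Fraction

def equally_likely_probability(event, sample_space):
    """Count of outcomes in the event divided by count of outcomes in S."""
    return Fraction(len(event & sample_space), len(sample_space))

digits = set(range(1, 10))  # S = {1, 2, ..., 9}
p_three_or_less = equally_likely_probability({1, 2, 3}, digits)

print(p_three_or_less)                   # 1/3
print(round(float(p_three_or_less), 3))  # 0.333

# Under Benford's law the same event has probability 0.602,
# so faked "random" first digits show too few values of 3 or less.
benford_three_or_less = 0.602
print(float(p_three_or_less) < benford_three_or_less)  # True
```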
Apply Your Knowledge
4.24 Possible outcomes for rolling a die.
A die has six sides with one to six spots on the sides. Give the probability distribution for the six possible outcomes that can result when a fair die is rolled.
Independence and the multiplication rule
Rule 3, the addition rule for disjoint events, describes the probability that one or the other of two events $A$ and $B$ occurs when $A$ and $B$ cannot occur together. Now we describe the probability that both events $A$ and $B$ occur, again only in a special situation. More general rules appear in Section 4.3.
Suppose that you toss a balanced coin twice. You are counting heads, so two events of interest are
$$A = \{\text{first toss is a head}\}$$
$$B = \{\text{second toss is a head}\}$$
The events $A$ and $B$ are not disjoint. They occur together whenever both tosses give heads. We want to compute the probability of the event {$A$ and $B$} that both tosses are heads. The Venn diagram in Figure 4.4 illustrates the event {$A$ and $B$} as the overlapping area that is common to both $A$ and $B$.
The coin tossing of Buffon, Pearson, and Kerrich described in Example 4.2 makes us willing to assign probability 1/2 to a head when we toss a coin. So,
$$P(A) = 0.5 \qquad P(B) = 0.5$$
What is $P(A \text{ and } B)$? Our common sense says that it is 1/4. The first coin will give a head half the time and then the second will give a head on half of those trials, so both coins will give heads on $1/2 \times 1/2 = 1/4$ of all trials in the long run. This reasoning assumes that the second coin still has probability 1/2 of a head after the first has given a head. This is true: we can verify it by tossing two coins many times and observing the proportion of heads on the second toss after the first toss has produced a head. We say that the events “head on the first toss” and “head on the second toss” are independent. Here is our final probability rule.
Multiplication Rule for Independent Events
Rule 5. Two events $A$ and $B$ are independent if knowing that one occurs does not change the probability that the other occurs. If $A$ and $B$ are independent,
$$P(A \text{ and } B) = P(A)P(B)$$
This is the multiplication rule for independent events.
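The empirical check suggested above, tossing two coins many times, can be imitated by simulation. The following Python sketch is an illustration only: it assumes a pseudorandom generator is an adequate stand-in for a balanced coin, and the seed and number of trials are arbitrary choices.

```python
import random

rng = random.Random(2024)  # arbitrary fixed seed so the run is repeatable
n_trials = 100_000

first_heads = 0  # trials where the first toss is a head
both_heads = 0   # trials where both tosses are heads

for _ in range(n_trials):
    first_is_head = rng.random() < 0.5
    second_is_head = rng.random() < 0.5
    if first_is_head:
        first_heads += 1
        if second_is_head:
            both_heads += 1

# Long-run proportion of "both heads"; the multiplication rule predicts 1/4
prop_both = both_heads / n_trials

# Proportion of heads on the second toss among trials whose first toss was a head;
# independence predicts this stays near 1/2
prop_second_after_first_head = both_heads / first_heads

print(prop_both, prop_second_after_first_head)
```

The printed proportions vary slightly with the seed, but with this many trials they should sit close to 0.25 and 0.5.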
Our definition of independence is rather informal. We make this informal idea precise in Section 4.3. In practice, though, we rarely need a precise definition of independence because independence is usually assumed as part of a probability model when we want to describe random phenomena that seem to be physically unrelated to each other.
EXAMPLE 4.14 Determining Independence Using the Multiplication Rule
Consider a manufacturer that uses two suppliers for supplying an identical part that enters the production line. Sixty percent of the parts come from one supplier, while the remaining 40% come from the other supplier. Internal quality audits find that there is a 1% chance that a randomly chosen part from the production line is defective. External supplier audits reveal that two parts per 1000 are defective from Supplier 1. Are the events of a part coming from a particular supplier—say, Supplier 1—and a part being defective independent?
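Define the two events as follows:
$$S_1 = \{\text{randomly chosen part comes from Supplier 1}\}$$
$$D = \{\text{randomly chosen part is defective}\}$$
We have $P(S_1) = 0.60$ and $P(D) = 0.01$. The product of these probabilities is
$$P(S_1)P(D) = (0.60)(0.01) = 0.006$$
However, supplier audits of Supplier 1 indicate that $P(S_1 \text{ and } D) = 0.002$. Given that $P(S_1 \text{ and } D) \ne P(S_1)P(D)$, we conclude that the supplier and defective part events are not independent.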
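The independence check in Example 4.14 amounts to comparing a joint probability with a product. The Python sketch below makes one assumption explicit: the audit figure of two parts per 1000 is read as the probability that a part both comes from Supplier 1 and is defective. The variable names are illustrative.

```python
import math

p_supplier1 = 0.60   # 60% of parts come from Supplier 1
p_defective = 0.01   # 1% chance a randomly chosen part is defective

# Assumption: "two parts per 1000" is taken as P(Supplier 1 and defective)
p_supplier1_and_defective = 0.002

# If the two events were independent, the joint probability would equal this product
product_if_independent = p_supplier1 * p_defective

events_independent = math.isclose(p_supplier1_and_defective, product_if_independent)

print(round(product_if_independent, 3))  # 0.006
print(events_independent)                # False
```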
The multiplication rule $P(A \text{ and } B) = P(A)P(B)$ holds if $A$ and $B$ are independent but not otherwise. The addition rule $P(A \text{ or } B) = P(A) + P(B)$ holds if $A$ and $B$ are disjoint but not otherwise. Resist the temptation to use these simple rules when the circumstances that justify them are not present. You must also be certain not to confuse disjointness and independence. Disjoint events cannot be independent. If $A$ and $B$ are disjoint, then the fact that $A$ occurs tells us that $B$ cannot occur; look back at Figure 4.2 (page 182). Thus, disjoint events are not independent. Unlike disjointness, independence is not easy to picture with a Venn diagram. A mosaic plot introduced in Chapter 2 provides a better way to visualize independence or lack of it. We will see more examples of mosaic plots in Chapter 9.
Reminder
mosaic plot, p. 109
Apply Your Knowledge
4.25 High school rank.
Select a first-year college student at random and ask what his or her academic rank was in high school. Here are the probabilities, based on proportions from a large sample survey of first-year students:
Rank | Top 20% | Second 20% | Third 20% | Fourth 20% | Lowest 20% |
Probability | 0.41 | 0.23 | 0.29 | 0.06 | 0.01 |
4.26 College-educated part-time workers?
For people aged 25 years or older, government data show that 34% of employed people have at least four years of college and that 20% of employed people work part-time. Can you conclude that because $(0.34)(0.20) = 0.068$, about 6.8% of employed people aged 25 years or older are college-educated part-time workers? Explain your answer.
Applying the probability rules
If two events $A$ and $B$ are independent, then their complements $A^c$ and $B^c$ are also independent and $A^c$ is independent of $B$. Suppose, for example, that 75% of all registered voters in a suburban district are Republicans. If an opinion poll interviews two voters chosen independently, the probability that the first is a Republican and the second is not a Republican is $(0.75)(0.25) = 0.1875$.
The multiplication rule also extends to collections of more than two events, provided that all are independent. Independence of events $A$, $B$, and $C$ means that no information about any one or any two can change the probability of the remaining events. The formal definition is a bit messy. Fortunately, independence is usually assumed in setting up a probability model. We can then use the multiplication rule freely.
By combining the rules we have learned, we can compute probabilities for rather complex events. Here is an example.
EXAMPLE 4.15 False Positives in Job Drug Testing
Job applicants in both the public and the private sector are often finding that preemployment drug testing is a requirement. The Society for Human Resource Management found that 71% of larger organizations require drug testing of new job applicants and that 44% of these organizations randomly test hired employees.6 From an applicant's or employee's perspective, one primary concern with drug testing is a “false-positive” result, that is, an indication of drug use when the individual has indeed not used drugs. If a job applicant tests positive, some companies allow the applicant to pay for a retest. For existing employees, a positive result is sometimes followed up with a more sophisticated and expensive test. Beyond cost considerations, there are issues of defamation, wrongful discharge, and emotional distress.
The enzyme multiplied immunoassay technique, or EMIT, applied to urine samples is one of the most common tests for illegal drugs because it is fast and inexpensive. Applied to people who are free of illegal drugs, EMIT has been reported to have false-positive rates ranging from 0.2% to 2.5%. If 150 employees are tested and all 150 are free of illegal drugs, what is the probability that at least one false positive will occur, assuming a 0.2% false positive rate?
It is reasonable to assume as part of the probability model that the test results for different individuals are independent. The probability that the test is positive for a single person is 0.2%, or 0.002, so the probability of a negative result is $1 - 0.002 = 0.998$ by the complement rule. The probability of at least one false-positive among the 150 people tested is, therefore,
$$P(\text{at least one positive}) = 1 - P(\text{no positives}) = 1 - P(\text{150 negatives}) = 1 - 0.998^{150} = 1 - 0.741 = 0.259$$
The probability is greater than 1/4 that at least one of the 150 people will test positive for illegal drugs even though no one has taken such drugs.
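The calculation in Example 4.15 combines the complement rule with the multiplication rule for independent events. A minimal Python sketch follows; the function name `prob_at_least_one_false_positive` is hypothetical, and independence across the people tested is assumed, as in the example.

```python
def prob_at_least_one_false_positive(false_positive_rate, n_tested):
    """P(at least one positive) when all n_tested people are drug free.

    Assumes test results for different people are independent.
    """
    p_negative = 1 - false_positive_rate    # complement rule, one person
    p_all_negative = p_negative ** n_tested # multiplication rule, n people
    return 1 - p_all_negative               # complement rule again

p_all_negative = (1 - 0.002) ** 150
p_at_least_one = prob_at_least_one_false_positive(0.002, 150)

print(round(p_all_negative, 3))  # 0.741
print(round(p_at_least_one, 3))  # 0.259
```

Calling the same function with the upper reported rate of 0.025 shows how quickly the chance of at least one false positive grows with the error rate.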
Apply Your Knowledge
4.27 Misleading résumés.
For more than two decades, Jude Werra, president of an executive recruiting firm, has tracked executive résumés to determine the rate of misrepresenting education credentials and/or employment information. On a biannual basis, Werra reports a now nationally recognized statistic known as the “Liars Index.” In 2013, Werra reported that 18.4% of executive job applicants lied on their résumés.7
4.28 Failing to detect drug use.
In Example 4.15, we considered how drug tests can indicate illegal drug use when no illegal drugs were actually used. Consider now another type of false test result. Suppose an employee is suspected of having used an illegal drug and is given two tests that operate independently of each other. Test A has probability 0.9 of being positive if the illegal drug has been used. Test B has probability 0.8 of being positive if the illegal drug has been used. What is the probability that neither test is positive if the illegal drug has been used?
4.29 Bright lights?
A string of holiday lights contains 20 lights. The lights are wired in series, so that if any light fails the whole string will go dark. Each light has probability 0.02 of failing during a three-year period. The lights fail independently of each other. What is the probability that the string of lights will remain bright for a three-year period?