EXAMPLE 19 Calculating conditional probability
Marketing companies are looking to statistical analysis and data mining in an effort to increase the customer response to their promotions. Therefore, these marketing companies are looking to hire more statistics-savvy students. Table 21 is adapted from a study on direct mail marketing. It contains the numbers of customers who either responded or did not respond to a direct mail marketing campaign, along with whether they had a credit card on file with the company. The two events are:
- R: Responded to the direct mail marketing campaign
- C: Has a credit card on file
Table 21 Credit card status and marketing response

| | Credit card no | Credit card yes | Total |
| --- | --- | --- | --- |
| Did not respond | 161 | 79 | 240 |
| Did respond | 17 | 31 | 48 |
| Total | 178 | 110 | 288 |

Source: Data Mining and Predictive Analytics, by Daniel Larose and Chantal Larose, Wiley Interscience, 2015.
- (a) Find the probability that a randomly chosen customer responded to the marketing campaign.
- (b) Calculate the probability that a randomly chosen customer both responded to the marketing campaign and has a credit card on file.
- (c) Find the conditional probability that a randomly selected customer responded, given that the customer has a credit card on file.
Solution
- (a) P(R) = N(R)/N(S). There are N(R) = 48 customers who did respond, and there are N(S) = 288 customers in this experiment. Thus,
P(R) = N(R)/N(S) = 48/288 ≈ 0.1667
- (b) Here, we are looking for P(R and C), the probability of the intersection of R and C. Earlier, we learned that this intersection is represented by the cell where the “Did respond” row meets the “Credit card yes” column. In Table 21, this cell contains N(R and C) = 31 customers. Thus, the probability that a randomly chosen customer both responded to the marketing campaign and has a credit card on file is P(R and C) = 31/288 ≈ 0.1076.
- (c) We will use P(R given C) = P(R|C) = N(R and C)/N(C) because, in this example, it is easier to work directly with the numbers of outcomes instead of the probabilities. From (b), N(R and C) = 31. Also, there are N(C) = 110 customers in total who had a credit card on file. Therefore,
P(R given C) = P(R|C) = N(R and C)/N(C) = 31/110 ≈ 0.2818
That is, the probability that a randomly chosen customer responded to the direct mail marketing campaign, given that the customer had a credit card on file, is approximately 0.2818.
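The three calculations above can be checked with a short Python sketch. It is only an illustration: the dictionary keys and variable names are made up for this example, and the counts are the four cells of Table 21. It also confirms that working with counts, N(R and C)/N(C), gives the same answer as the ratio of probabilities, P(R and C)/P(C).

```python
from fractions import Fraction

# Cell counts from Table 21, keyed by (response status, credit card status).
# The key names are illustrative labels, not part of the original study.
counts = {
    ("did_not_respond", "no_card"): 161,
    ("did_not_respond", "card"): 79,
    ("responded", "no_card"): 17,
    ("responded", "card"): 31,
}

# Totals needed for the three probabilities.
n_s = sum(counts.values())                                             # N(S)
n_r = sum(n for (resp, _), n in counts.items() if resp == "responded")  # N(R)
n_c = sum(n for (_, card), n in counts.items() if card == "card")       # N(C)
n_r_and_c = counts[("responded", "card")]                               # N(R and C)

# Exact fractions avoid rounding until the final print.
p_r = Fraction(n_r, n_s)                  # part (a)
p_r_and_c = Fraction(n_r_and_c, n_s)      # part (b)
p_r_given_c = Fraction(n_r_and_c, n_c)    # part (c), using counts

# The same conditional probability from the ratio of probabilities.
p_c = Fraction(n_c, n_s)
assert p_r_given_c == p_r_and_c / p_c

print(f"P(R)       = {float(p_r):.4f}")
print(f"P(R and C) = {float(p_r_and_c):.4f}")
print(f"P(R | C)   = {float(p_r_given_c):.4f}")
```

The final check passes because N(S) cancels in the ratio (31/288)/(110/288), which is why working with the counts is the shorter route in this example.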