Essentials of Data Science – Probability and Statistical Inference – Conditional Probability

In the previous note on the principle of counting, we learned why counting is needed and covered different tools and techniques for counting the number of ways an event can happen. In this note, we continue by learning to compute different types of probabilities, one of which is the conditional probability.

Intuitive notion of Conditional Probability

Conditional probability is useful for calculating probabilities when some partial information about the result of the experiment is available, or for recalculating them in light of additional information. In such situations, the desired probabilities are conditional ones.

Often, the easiest way to compute the probability of an event is to condition on the occurrence or non-occurrence of a secondary event.

For example, suppose a blood test is developed to diagnose a particular infection. The test is conducted on 100 randomly selected persons. The absolute and relative frequencies of the outcomes are presented in the following tables:

[Tables: Conditional Probability - Example. Absolute Frequency Table; Relative Frequency Table]

The absolute frequencies are presented in the above table. It shows that, of the 40 persons who tested positive, the infection is present in 30, and of the 60 persons who tested negative, the infection is absent in 45.

There are the following four possible outcomes:

  • The blood sample has an infection and the test diagnoses it, i.e. the test is correctly diagnosing the infection.
  • The blood sample does not have any infection and the test does not diagnose it, i.e. the test is correctly diagnosing that there is no infection.
  • The blood sample has an infection and the test does not diagnose it, i.e. the test is incorrect in stating that there is no infection.
  • The blood sample does not have any infection but the test diagnoses it, i.e. the test is incorrect in stating that there is an infection.

If one already knows that the test is positive (T+) and wants to determine the probability that the infection is indeed present (IP), the answer is given by the conditional probability P(IP|T+), which is:

P(IP|T+) = \frac{P(IP \cap T+)}{P(T+)} = \frac{0.3}{0.4} = 0.75

Note that P(IP \cap T+) denotes the relative frequency of blood samples in which the infection is present and the test is positive, which is 0.3, and P(T+) denotes the relative frequency of positive tests, which is 0.4.
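The calculation above can be reproduced with a few lines of Python. This is only a minimal sketch: the labels "IA" (infection absent) and "T-" (test negative) and the helper names are assumptions made for illustration, and the counts 10 and 15 are not stated in the text but are obtained by subtracting 30 from 40 and 45 from 60.

```python
from fractions import Fraction

# Absolute frequencies for the 100 tested persons.
# 30 (IP, T+) and 45 (IA, T-) come from the text; 10 and 15 are
# assumed here by subtraction from the totals 40 (T+) and 60 (T-).
counts = {
    ("IP", "T+"): 30,
    ("IA", "T+"): 10,
    ("IP", "T-"): 15,
    ("IA", "T-"): 45,
}
n = sum(counts.values())  # total number of persons tested


def joint(infection, test):
    """Relative frequency of the pair (infection status, test result)."""
    return Fraction(counts[(infection, test)], n)


def marginal_test(test):
    """Relative frequency of a test result, summed over infection status."""
    return sum(joint(inf, test) for inf in ("IP", "IA"))


def conditional(infection, test):
    """P(infection status | test result) = joint / marginal."""
    return joint(infection, test) / marginal_test(test)


p = conditional("IP", "T+")
print(p, float(p))  # 3/4 0.75
```

Using exact fractions avoids rounding noise, so the result matches 0.3 / 0.4 = 0.75 exactly.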

Definition of Conditional Probability

Let P(A) > 0, then the conditional probability of event B occurring, given that event A has already occurred, is 

P(B|A) = \frac{P(A \cap B)}{P(A)}

The roles of A and B can be interchanged to define P(A|B) as follows.

Let P(B) > 0; then the conditional probability of A given B is:

P(A|B) = \frac{P(A \cap B)}{P(B)}

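The definition of conditional probability is consistent with the interpretation of probability as a long-run relative frequency. Suppose a large number n of repetitions of the experiment are performed. Event A occurs in approximately nP(A) of them, and A and B occur together in approximately nP(A \cap B) of them. Hence, among the repetitions in which A occurs, the proportion in which B also occurs is approximately \frac{nP(A \cap B)}{nP(A)} = \frac{P(A \cap B)}{P(A)} = P(B|A).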
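The long-run relative frequency reading can be checked with a small simulation of the blood test example. This is a hedged sketch, not part of the original lecture: it assumes the four outcomes occur with weights 30, 10, 15 and 45 (the last two inferred by subtraction from the stated totals), and the function name and seed are arbitrary choices. Among the repetitions where the test is positive, it counts the share in which the infection is present.

```python
import random

# Outcomes as (infection status, test result) with assumed weights:
# 30 and 45 are from the text, 10 and 15 are inferred by subtraction.
outcomes = [("IP", "T+"), ("IA", "T+"), ("IP", "T-"), ("IA", "T-")]
weights = [30, 10, 15, 45]


def estimate_conditional(n_repetitions, seed=0):
    """Estimate P(IP | T+) as a ratio of counts over repeated draws."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    draws = rng.choices(outcomes, weights=weights, k=n_repetitions)
    # Number of repetitions in which the conditioning event T+ occurs.
    n_positive = sum(1 for _, test in draws if test == "T+")
    # Number of repetitions in which both IP and T+ occur.
    n_both = sum(1 for inf, test in draws if inf == "IP" and test == "T+")
    return n_both / n_positive


for n in (100, 10_000, 100_000):
    print(n, round(estimate_conditional(n), 4))
```

As n grows, the printed estimates should settle near the exact value 0.75.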

Example 1: 

A coin is tossed twice. If we assume that all four outcomes in the sample space \Omega = \{(H,H), (H,T), (T,H), (T,T)\} are equally likely, what is the conditional probability that both tosses result in heads, given that the first toss results in a head?

Solution:

If A = \{(H,H)\} denotes the event that both tosses result in heads, and B = \{(H,H),(H,T)\} the event that the first toss results in a head, then the desired probability is:

P(A|B) = \frac{P(A \cap B)}{P(B)} = \frac{P(\{(H,H)\})}{P(\{(H,H),(H,T)\})} = \frac{\frac{1}{4}}{\frac{2}{4}} = \frac{1}{2}
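The same answer can be obtained by enumerating the sample space and applying the definition directly. The sketch below assumes equally likely outcomes, as in the example; the helper names prob and cond_prob are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Sample space of two coin tosses: (H,H), (H,T), (T,H), (T,T).
omega = list(product("HT", repeat=2))


def prob(event):
    """Probability of an event (a set of outcomes) under equal likelihood."""
    return Fraction(len(event), len(omega))


def cond_prob(a, b):
    """P(a | b) = P(a and b) / P(b); requires P(b) > 0."""
    return prob(a & b) / prob(b)


A = {w for w in omega if w == ("H", "H")}  # both tosses are heads
B = {w for w in omega if w[0] == "H"}      # first toss is a head

print(cond_prob(A, B))  # 1/2
```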

Example 2:

An urn contains 10 white, 5 yellow, and 10 black marbles. A marble is chosen at random from the urn, and it is noted that it is not one of the black marbles. What is the probability that it is yellow? 

Solution:

Let Y denote the event that the marble selected is yellow, and \bar{B} denote the event that it is not black. The desired probability is P(Y| \bar{B}) = \frac{P(Y \cap \bar{B})}{P(\bar{B})}

Here Y \cap \bar{B} = Y, since the marble will be both yellow and not black if and only if it is yellow. Hence, assuming that each of the 25 marbles is equally likely to be chosen, we obtain that

P(Y| \bar{B}) = \frac{P(Y \cap \bar{B})}{P(\bar{B})} = \frac{\frac{5}{25}}{\frac{15}{25}}  = \frac{1}{3}
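An equivalent way to read this result, sketched here as a worked step, is to treat the non-black marbles as a reduced sample space: once the marble is known to be non-black, only the 10 white and 5 yellow marbles remain possible, and each is equally likely.

```latex
P(Y|\bar{B}) = \frac{\text{number of yellow marbles}}{\text{number of non-black marbles}} = \frac{5}{10 + 5} = \frac{1}{3}
```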

Example 3:

A box contains 5 defective, 10 partially defective (that fail after a couple of hours of use), and 25 acceptable (non-defective) transistors. A transistor is chosen at random from the box and put into use. If it does not immediately fail, what is the probability it is acceptable?

Solution:

Given that the transistor did not immediately fail, we know that it is not one of the 5 defectives. Since every acceptable transistor is also not defective, P(acceptable, not defective) = P(acceptable), so the desired probability is:

P(\text{acceptable | not defective}) = \frac{P(\text{acceptable, not defective})}{P(\text{not defective})} = \frac{\frac{25}{40}}{\frac{35}{40}} = \frac{5}{7}
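Conditioning can also be viewed as filtering: drop the outcomes ruled out by the given information and compute an ordinary proportion on what remains. The sketch below illustrates this for the transistor box, assuming each of the 40 transistors is equally likely to be chosen; the variable names are illustrative.

```python
from fractions import Fraction

# The box: 5 defective, 10 partially defective, 25 acceptable transistors.
box = ["defective"] * 5 + ["partially defective"] * 10 + ["acceptable"] * 25

# Knowing the transistor did not immediately fail rules out the 5 defectives.
survivors = [t for t in box if t != "defective"]

# Proportion of acceptable transistors among those that remain.
p = Fraction(survivors.count("acceptable"), len(survivors))
print(p)  # 5/7
```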

Q&A

From this note, we can get answers to the following questions.

  • What is conditional probability?
  • When to use conditional probability?

References

  1. Essentials of Data Science With R Software – 1: Probability and Statistical Inference, By Prof. Shalabh, Dept. of Mathematics and Statistics, IIT Kanpur.
