Essentials of Data Science – Probability and Statistical Inference – Poisson Distribution

In the previous note on Probability and Statistical Inference, we have seen the importance of probability distributions. 

This note covers the basic intuition behind the Poisson distribution, together with its expectation, variance, and other quantitative measures that characterize a Poisson random variable. It also looks at several kinds of phenomena that follow a Poisson distribution.
Using the probability function of the Poisson distribution, we can compute the expectation, variance, and other quantitative measures of such phenomena.

Conditions for a Poisson Distribution

Consider a situation in which the number of opportunities for an event to occur (trials) in a fixed time interval is very large, while the probability that the event occurs on any single trial is very small. Typical examples include:

  • The number of emails arriving in a mailbox in a particular time frame.
  • The number of customers walking into a shop in a day.
  • The number of alpha particles emitted by a radioactive substance entering a particular region in a given short time interval. The number of emitted alpha particles is very high. However, only a few particles are transmitted through the region in a given short time interval.
  • The number of misprints on a page of a book. For instance, suppose that there is a small probability that each letter typed on a page will be misprinted.
  • The number of people in a city living to 100 years of age.
  • The number of wrong telephone numbers that are dialled in a day.

The Poisson distribution is a discrete probability distribution that gives the probability of a given number of events occurring in a fixed interval of time or space. It applies when the number of opportunities for the event is huge and the probability that the event of interest occurs on any one opportunity is very small.

Poisson Distribution

A discrete random variable X is said to follow a Poisson distribution with parameter \lambda > 0 if its Probability Mass Function (PMF) is given by:

P(X = x)  = \begin{cases} \frac{e^{-\lambda} \lambda^x}{x!} & \text{ if } x = 0, 1, 2, \cdots \\ 0 & \text{ otherwise }  \end{cases}

It is also denoted as:  X \sim P(\lambda)
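The PMF can be evaluated directly with the Python standard library. The following is a minimal sketch; the function name poisson_pmf and the test value \lambda = 2.5 are illustrative choices, not part of the original note. The support is taken as x = 0, 1, 2, \cdots, so that the probabilities sum to 1.

```python
import math

def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for X ~ Poisson(lam); zero outside x = 0, 1, 2, ..."""
    if x < 0:
        return 0.0
    return math.exp(-lam) * lam ** x / math.factorial(x)

# Summing over the first 100 support points (the tail beyond that is
# negligible for lambda = 2.5) should give a total very close to 1.
total = sum(poisson_pmf(x, 2.5) for x in range(100))
print(f"sum of PMF for lambda = 2.5: {total:.6f}")
```

The sum includes the term for x = 0, which equals e^{-\lambda}; without it the probabilities would not add up to 1.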

The mean and variance of a Poisson random variable are identical:

  • E(X) = \lambda
  • Var(X) = \lambda

Note: From the data science perspective, this equality is a useful diagnostic. When the observed mean and variance of a count phenomenon are approximately equal, we may try to approximate the phenomenon with the Poisson distribution.
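The equality of mean and variance can be checked numerically by summing over the PMF. This is only a sketch: the value \lambda = 4 and the truncation of the infinite sum at 100 terms are assumptions made for illustration.

```python
import math

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam), x = 0, 1, 2, ..."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

lam = 4.0
support = range(100)  # assumption: the tail beyond 100 is negligible for lam = 4

# E(X) = sum of x * P(X = x); Var(X) = E(X^2) - E(X)^2
mean = sum(x * poisson_pmf(x, lam) for x in support)
second_moment = sum(x * x * poisson_pmf(x, lam) for x in support)
variance = second_moment - mean ** 2

print(f"mean     = {mean:.6f}")
print(f"variance = {variance:.6f}")
```

Both printed values should agree with \lambda up to floating-point error.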

Characteristics of Poisson distribution

Poisson distribution characteristics help us decide whether a given random phenomenon can be approximated with Poisson distribution or not.

  • The Poisson distribution is a discrete probability distribution, so the outcomes must be discrete counts.
  • The number of occurrences in each interval can be any non-negative integer, with no upper limit.
  • The random phenomenon describes rare events counted over an interval, and the events occur independently of each other.
  • The expected number of occurrences E(X) is assumed constant throughout the experiment.

The Poisson random variable has a wide range of applications in a variety of areas because it may be used as an approximation for a Binomial random variable with parameters (n,p) when n is large and p is small. 

For example:

  • Suppose n independent trials are performed, each of which results in success with probability p. When n is large and p is small, the number of successes is approximately a Poisson random variable with mean \lambda = np.
  • Suppose the random phenomenon is the number of misprints on a page of a given book. Let p be the probability that each letter typed on the page is misprinted. The number of misprints on the page is then approximately Poisson with mean \lambda = np, where n is the large number of letters on that page.
  • Suppose that each person in a given community independently has a small probability p of reaching the age of 100. The number of people who do is then approximately Poisson distributed with mean \lambda = np, where n is the large number of people in the community.
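The quality of this approximation can be explored numerically. The sketch below holds \lambda = np fixed while n grows, and reports the largest pointwise gap between the Binomial and Poisson PMFs. The helper name max_abs_gap, the value \lambda = 2, and the cut-off k_max = 20 are assumptions chosen for illustration.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def max_abs_gap(n, lam, k_max=20):
    """Largest |Binomial - Poisson| over k = 0..min(n, k_max), with p = lam / n."""
    p = lam / n
    return max(
        abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam))
        for k in range(min(n, k_max) + 1)
    )

# Keep lambda = n * p fixed: as n grows, p shrinks and the gap closes.
for n in (10, 100, 1000, 10000):
    print(f"n = {n:>5}, p = {2.0 / n:.4f}, max gap = {max_abs_gap(n, 2.0):.6f}")
```

The gap should shrink as n increases, which is the sense in which the Poisson distribution approximates the Binomial for large n and small p.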

Example 1: Suppose a country experiences four earthquakes per year on average. The probability of exactly two earthquakes in a given year is obtained from the Poisson distribution as follows.

Here mean \lambda = 4 and x = 2.

\begin{aligned} P(X = 2) &= \frac{{e^{ - \lambda } \lambda^x }}{x!} \\ &= \frac{{e^{ - 4} 4^2 }}{2!} \\ &= 0.1465 \end{aligned}

In the above example, each earthquake lasts for a very short time compared with a year, so a year can be viewed as an enormous number of short intervals, each with a very small probability of containing an earthquake. We may therefore assume that the Poisson distribution approximates this random phenomenon well.
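The arithmetic of Example 1 can be reproduced in a few lines of Python. This is a sketch using only the standard library; the variable names are illustrative.

```python
import math

lam = 4  # average number of earthquakes per year (Example 1)
x = 2    # number of earthquakes we ask about

# Poisson PMF evaluated at x = 2
p_two = math.exp(-lam) * lam ** x / math.factorial(x)
print(f"P(X = 2) = {p_two:.4f}")  # prints P(X = 2) = 0.1465

# For context, the probabilities of 0 to 8 earthquakes in a year
for k in range(9):
    p_k = math.exp(-lam) * lam ** k / math.factorial(k)
    print(f"P(X = {k}) = {p_k:.4f}")
```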

Example 2: Suppose that the average number of accidents occurring weekly on a particular stretch of a highway equals three. The probability that there is at least one accident this week is as follows:

Here mean \lambda = 3, \text{ and }x \geq 1

\begin{aligned} P(X \geq 1) &= 1 - P(X = 0) \\ &= 1 - \frac{e^{-\lambda} \lambda^0}{0!} \\ &= 1 - \frac{e^{-3} 3^0}{0!} \\ &= 1 - e^{-3} \\ &\approx 0.9502 \end{aligned}
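The complement rule used in Example 2 generalizes to any threshold. A minimal sketch follows; the helper name prob_at_least is a hypothetical choice, not taken from the original note.

```python
import math

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam), x = 0, 1, 2, ..."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

def prob_at_least(k, lam):
    """P(X >= k) = 1 - [P(X = 0) + ... + P(X = k - 1)]."""
    return 1.0 - sum(poisson_pmf(x, lam) for x in range(k))

# Example 2: on average three accidents per week, at least one this week
p = prob_at_least(1, 3)
print(f"P(X >= 1) = {p:.4f}")  # prints P(X >= 1) = 0.9502
```

Working through the complement avoids summing the infinitely many terms P(X = 1) + P(X = 2) + \cdots directly.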

Example 3: Suppose the probability that an item is defective is 0.1. Assuming that the quality of successive items is independent, the probability that a sample of 10 items will contain at most one defective item can be obtained by binomial as well as Poisson distributions as follows:

Here p = 0.1, n = 10, \lambda = np = 1

Using Binomial distribution:

\begin{aligned} P(X = 0) + P(X = 1) &= \binom{10}{0} 0.1^0 (1-0.1)^{10-0} + \binom{10}{1} 0.1^1 (1-0.1)^{10-1} \\ &\approx 0.7361 \end{aligned}

Using Poisson distribution:

\begin{aligned} P(X = 0) + P(X = 1) &= \frac{e^{-1} 1^0}{0!} + \frac{e^{-1} 1^1}{1!} \\ &= 2e^{-1} \\ &\approx 0.7358 \end{aligned}
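Both calculations in Example 3 can be carried out side by side. The sketch below uses only the standard library; the variable names are illustrative.

```python
import math

n, p = 10, 0.1
lam = n * p  # Poisson mean matching the Binomial mean

# Exact Binomial probability of at most one defective item
binom_at_most_one = sum(
    math.comb(n, k) * p ** k * (1 - p) ** (n - k) for k in (0, 1)
)

# Poisson approximation of the same probability
poisson_at_most_one = sum(
    math.exp(-lam) * lam ** k / math.factorial(k) for k in (0, 1)
)

print(f"Binomial: {binom_at_most_one:.4f}")    # prints Binomial: 0.7361
print(f"Poisson:  {poisson_at_most_one:.4f}")  # prints Poisson:  0.7358
```

The two answers differ only in the fourth decimal place, even though n = 10 is not especially large.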

Properties of Poisson Distribution

Additivity Property: The Poisson distribution possesses the additivity property that the sum of independent Poisson random variables is also a Poisson random variable.

For example, suppose that X_1 and X_2 are independent Poisson random variables with respective means \lambda_1 and \lambda_2. Then X_1 + X_2 follows a Poisson distribution with mean \lambda_1 + \lambda_2.
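The additivity property can be verified numerically by convolving the two PMFs, since for independent variables P(X_1 + X_2 = k) = \sum_{j=0}^{k} P(X_1 = j) P(X_2 = k - j). In this sketch the helper name pmf_of_sum and the means 1.5 and 2.5 are illustrative assumptions.

```python
import math

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam), x = 0, 1, 2, ..."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

def pmf_of_sum(k, lam1, lam2):
    """P(X1 + X2 = k) by convolving the two PMFs (independence assumed)."""
    return sum(
        poisson_pmf(j, lam1) * poisson_pmf(k - j, lam2) for j in range(k + 1)
    )

lam1, lam2 = 1.5, 2.5
for k in range(6):
    convolved = pmf_of_sum(k, lam1, lam2)
    direct = poisson_pmf(k, lam1 + lam2)
    print(f"k = {k}: convolution = {convolved:.6f}, Poisson(4) = {direct:.6f}")
```

The two columns should agree up to floating-point error for every k.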

Recurrence Relationship: 

  • If P(X = i)  = \frac{e^{-\lambda} \lambda^i}{i!} then P(X = i + 1) = \frac{\lambda}{i + 1} P(X = i)
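The recurrence gives a convenient way to build a table of probabilities without computing large powers or factorials: start from P(X = 0) = e^{-\lambda} and multiply by \lambda / (i + 1) at each step. A minimal sketch, with the hypothetical helper name poisson_pmf_table:

```python
import math

def poisson_pmf_table(lam, max_x):
    """Return [P(X = 0), ..., P(X = max_x)] using the recurrence relation."""
    probs = [math.exp(-lam)]  # P(X = 0) = e^{-lam}
    for i in range(max_x):
        # P(X = i + 1) = lam / (i + 1) * P(X = i)
        probs.append(probs[-1] * lam / (i + 1))
    return probs

table = poisson_pmf_table(4.0, 8)
for x, prob in enumerate(table):
    print(f"P(X = {x}) = {prob:.4f}")
```

Each entry costs one multiplication and one division, which is useful when many consecutive probabilities are needed.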

References

  1. Essentials of Data Science With R Software – 1: Probability and Statistical Inference, By Prof. Shalabh, Dept. of Mathematics and Statistics, IIT Kanpur.

CITE THIS AS:

“Probability and Statistical Inference – Introduction to Poisson Probability Distribution”  From NotePub.io – Publish & Share Note! https://notepub.io/notes/mathematics/statistics/statistical-inference-for-data-science/poisson-distribution/
