Essentials of Data Science – Probability and Statistical Inference – Skewness and Kurtosis

In the previous note on Probability and Statistical Inference, we learned about expectations and moments of the probability distribution of a random variable, which describe the central tendency and variability of its values.

In this note, we extend the concept of moments to study two further characteristics of the probability distribution of a random variable: its symmetry (skewness) and its peakedness (kurtosis).

Introduction

Moments are used to describe different characteristics and features of the probability distribution of a random variable. These characteristics are central tendency, dispersion, symmetry, and peakedness of the probability curve or distribution.

Moment

Suppose X is a random variable and g(X) is a real-valued function of X. Let g(X) = (X-A)^r, where r is a nonnegative integer. Then E[g(X)] = E[(X-A)^r] is called the r^{th} moment of X about the point A.

General observations:

  • When A = E(X) and r = 1, E[(X-A)] = E[X - E(X)] = \mu_1, where \mu_1 is called the first central moment of X; it is always zero.
  • When A = E(X) and r = 2, E[(X-A)^2] = E[(X - E(X))^2] = \mu_2 is called the variance of X. It measures the variability of the probability distribution of a random variable.
  • When A = E(X) and r = 3, E[(X-A)^3] = E[(X - E(X))^3] = \mu_3. It helps in determining the skewness of the probability distribution of X.
  • When A = E(X) and r = 4, E[(X-A)^4] = E[(X - E(X))^4] = \mu_4. It helps in determining the peakedness of the probability distribution of X.
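As a small illustration of these definitions, the following Python sketch computes moments about a point A for a list of observed values, replacing the expectation with a plain average. The function names and the data set are made up for this example and do not come from the source course.

```python
def moment_about(values, r, a):
    """r-th sample moment of `values` about the point `a`.

    The expectation E[(X - A)^r] is approximated by an average
    over the observed values (an assumption of this sketch).
    """
    return sum((x - a) ** r for x in values) / len(values)


def central_moment(values, r):
    """r-th sample central moment, i.e. the moment about the mean."""
    mean = sum(values) / len(values)
    return moment_about(values, r, mean)


# Hypothetical data set, chosen so that the arithmetic is easy to check by hand.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mu1 = central_moment(data, 1)  # first central moment, always zero
mu2 = central_moment(data, 2)  # variance
mu3 = central_moment(data, 3)  # used for skewness
mu4 = central_moment(data, 4)  # used for peakedness

print(mu1, mu2, mu3, mu4)  # 0.0 4.0 5.25 44.5
```

The mean of this data set is 5, so the deviations are -3, -1, -1, -1, 0, 0, 2, 4, and the printed values can be verified directly from their powers.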

Expectation

Expectation measures the central tendency of the probability distribution.

Variance

Variance measures the dispersion or variability in the probability distribution.

Two different distributions may have the same expectation and variance. Hence, knowing only the expectation and variance, we cannot determine the probability distribution, as these two measures do not uniquely identify it.
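To make this point concrete, here is a minimal Python sketch with two hypothetical samples that share the same mean and variance but differ in shape. The samples and helper names are assumptions chosen for illustration only.

```python
def mean_of(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def central_moment(values, r):
    """r-th sample central moment (average of r-th powers of deviations)."""
    m = mean_of(values)
    return sum((x - m) ** r for x in values) / len(values)


# Two made-up samples: the first has a longer right tail,
# the second is symmetric about its mean.
sample_a = [2, 4, 4, 4, 5, 5, 7, 9]
sample_b = [3, 3, 3, 3, 7, 7, 7, 7]

summary = {}
for name, sample in (("a", sample_a), ("b", sample_b)):
    summary[name] = (
        mean_of(sample),            # expectation
        central_moment(sample, 2),  # variance
        central_moment(sample, 3),  # third central moment
    )
    print(name, summary[name])
# a (5.0, 4.0, 5.25)
# b (5.0, 4.0, 0.0)
```

Both samples have mean 5 and variance 4, yet their third central moments differ, which is why higher moments are needed to describe the shape.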

Skewness

Skewness describes the symmetry of a probability distribution: whether its mass is concentrated more on the left, more on the right, or evenly about the center. To quantify it, we use the coefficient of skewness.

The literal meaning of skewness is lack of symmetry, and it gives an idea about the shape of the curve obtained by probability distribution. It shows the nature and concentration of observations towards higher or lower values of a random variable.

A probability distribution is said to be skewed if its curve is not symmetric, that is, if it is stretched more to one side than to the other.

  • The probability distribution for which the curve has a longer tail towards the left-hand side is negatively skewed.
  • The probability distribution for which the curve has a longer tail towards the right-hand side is positively skewed.
  • The probability distribution for which the curve extends equally on both the left and the right has zero skewness.

Distributions are therefore grouped into three categories: positively skewed, negatively skewed, and zero skewness.

Figure: Symmetry and peakedness of a curve (negatively and positively skewed distributions).

In a negatively skewed probability curve, most values are clustered on the right side of the curve while the left tail is longer. In a positively skewed probability curve, most values are clustered on the left side of the curve while the right tail is longer.

The coefficient of skewness measures the skewness of a probability distribution. The version used here is based on the moments of the probability distribution. It is one of several measures of skewness; others are built from measures of central tendency and position such as the mean, median, mode, quantiles, and percentiles.

Coefficient of Skewness

\beta_1 = \frac{\mu^2_3}{\mu^3_2}

Here \mu_2 and \mu_3 are the second and third central moments, respectively. Since \mu_3 is squared, \beta_1 measures only the magnitude of skewness. To capture both the magnitude and the sign, positive (+) or negative (-), we use \gamma_1 as defined in the equation below; its sign is the sign of \mu_3.

\gamma_1 = \pm \sqrt{ \beta_1} = \frac{\mu_3}{\sqrt{\mu^3_2}}

Interpretations:

  • If \gamma_1 = 0, the distribution has zero skewness. This holds for every symmetric distribution, of which the normal distribution is one example.
  • If \gamma_1 > 0, the distribution is positively skewed.
  • If \gamma_1 < 0, the distribution is negatively skewed.
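The two coefficients can be sketched in Python as follows. This is a minimal illustration that uses sample central moments in place of the theoretical ones; the function names and data sets are hypothetical.

```python
import math


def central_moment(values, r):
    """r-th sample central moment (average of r-th powers of deviations)."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** r for x in values) / len(values)


def skewness_coefficients(values):
    """Return (beta_1, gamma_1) computed from sample central moments.

    Assumes the values are not all equal, otherwise mu_2 is zero
    and the ratios are undefined.
    """
    mu2 = central_moment(values, 2)
    mu3 = central_moment(values, 3)
    beta1 = mu3 ** 2 / mu2 ** 3          # magnitude only
    gamma1 = mu3 / math.sqrt(mu2 ** 3)   # carries the sign of mu_3
    return beta1, gamma1


# Made-up samples: one with a longer right tail, its mirror image,
# and one that is symmetric about its mean.
right_tailed = [2, 4, 4, 4, 5, 5, 7, 9]
left_tailed = [-x for x in right_tailed]
symmetric = [3, 3, 3, 3, 7, 7, 7, 7]

beta1_right, gamma1_right = skewness_coefficients(right_tailed)
beta1_left, gamma1_left = skewness_coefficients(left_tailed)
beta1_sym, gamma1_sym = skewness_coefficients(symmetric)

print(gamma1_right)  # 0.65625
print(gamma1_left)   # -0.65625
print(gamma1_sym)    # 0.0
```

Mirroring the data flips the sign of \gamma_1 but leaves \beta_1 unchanged, which shows why \beta_1 alone cannot tell the direction of the skew.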

Thus the value of \gamma_1 tells us whether the distribution shows skewness and, if it does, whether it is positively or negatively skewed.

When the distribution is positively or negatively skewed, the measures of central tendency such as the mean, median, and mode generally differ from one another.
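The following sketch uses Python's standard `statistics` module to compare the three measures on two made-up samples. The ordering mode < median < mean seen for the right-tailed sample is a common pattern for positive skew, not a guarantee.

```python
import statistics

# Hypothetical samples: one with a longer right tail, one symmetric.
right_tailed = [2, 4, 4, 4, 5, 5, 7, 9]
symmetric = [3, 4, 5, 5, 5, 6, 7]

skewed_summary = (
    statistics.mean(right_tailed),    # 5
    statistics.median(right_tailed),  # 4.5
    statistics.mode(right_tailed),    # 4
)
symmetric_summary = (
    statistics.mean(symmetric),    # 5
    statistics.median(symmetric),  # 5
    statistics.mode(symmetric),    # 5
)

print(skewed_summary)
print(symmetric_summary)
```

For the symmetric sample all three measures coincide, while for the right-tailed sample the mean is pulled towards the longer tail.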

Kurtosis

Kurtosis describes the peakedness or flatness of the probability distribution of a random variable X. Flatness here means how flat the curve is at its peak.

The peakedness of a probability distribution is measured relative to the peakedness of a normal distribution. If the curve is normally distributed, the kurtosis value, measured as the excess over the normal distribution, is zero. For curves that are more peaked or less peaked than the normal curve, it is positive or negative, respectively.

More about Normal Distribution:

The normal distribution N(\mu, \sigma^2) has the following probability density function:

f(x) = \frac{1}{(2 \pi \sigma^2)^\frac{1}{2}} \exp(-\frac{(x-\mu)^2}{2 \sigma^2}) ; -\infty < x < \infty

where \mu and \sigma^2 are the parameters: \mu represents the mean and \sigma^2 represents the variance.

Properties of Normal distribution:

  • Bell shaped curve
  • Symmetric around mean
  • Skewness = 0
  • Excess kurtosis \gamma_2 = 0 (equivalently, \beta_2 = 3)
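These properties can be checked numerically. The sketch below codes the density f(x) given above and approximates its central moments with a simple midpoint rule; the integration range, step count, and parameter values are arbitrary choices made for this illustration. Zero kurtosis is read here as \beta_2 - 3 = 0, where \beta_2 = \mu_4 / \mu_2^2.

```python
import math


def normal_pdf(x, mu, sigma2):
    """Density of N(mu, sigma2) at the point x."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)


def normal_central_moment(r, mu, sigma2, steps=20000, width=12.0):
    """Approximate E[(X - mu)^r] by the midpoint rule.

    Integrates over mu +/- width * sigma; the tails outside this
    range are negligible for the chosen width.
    """
    sigma = math.sqrt(sigma2)
    lower = mu - width * sigma
    h = 2 * width * sigma / steps
    total = 0.0
    for i in range(steps):
        x = lower + (i + 0.5) * h
        total += (x - mu) ** r * normal_pdf(x, mu, sigma2)
    return total * h


mu, sigma2 = 1.0, 4.0  # arbitrary parameter values for the check

mu2 = normal_central_moment(2, mu, sigma2)
mu3 = normal_central_moment(3, mu, sigma2)
mu4 = normal_central_moment(4, mu, sigma2)

gamma1 = mu3 / mu2 ** 1.5  # coefficient of skewness
beta2 = mu4 / mu2 ** 2     # Pearson's coefficient of kurtosis
gamma2 = beta2 - 3         # excess over the normal reference value

print(round(mu2, 6), round(beta2, 6))
```

Up to numerical error, the variance comes out as \sigma^2, the skewness as zero, and \beta_2 as three.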

Kurtosis examines the hump or flatness of a given curve or distribution relative to the hump or flatness of the normal distribution. The shape of the hump of the normal distribution is taken as the standard.

  • Curves with a hump like that of the normal distribution curve are called mesokurtic.
  • Curves with greater peakedness or less flatness than the normal distribution curve are called leptokurtic.
  • Curves with less peakedness or greater flatness than the normal distribution curve are called platykurtic.

Quantify the Peakedness using Coefficient of Kurtosis

We can quantify the peakedness using the coefficient of Kurtosis. However, there are different types of coefficient of Kurtosis, and one of them is known as Karl Pearson’s coefficient of Kurtosis. It is represented as follows:

\beta_2 = \frac{\mu_4}{\mu^2_2}

Here \mu_2 and \mu_4 are the second and fourth central moments, respectively. \beta_2 is always positive, so on its own it does not show whether a curve is more or less peaked than the normal curve. For that, we subtract three from \beta_2, as in the equation below. Three is the \beta_2 value of the normal distribution, whose peakedness is the reference for comparison.

\gamma_2 = \beta_2 - 3
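
As a quick check of the reference value three, we can use two standard results for X following N(\mu, \sigma^2), quoted here without derivation: \mu_2 = \sigma^2 and \mu_4 = 3\sigma^4. Substituting them gives

\beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{3\sigma^4}{(\sigma^2)^2} = 3, \qquad \gamma_2 = \beta_2 - 3 = 0

so the result does not depend on the values of \mu and \sigma^2.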

Interpretations:

  • For a normal distribution or a mesokurtic distribution, \beta_2 = 3 and \gamma_2 = 0.
  • For a leptokurtic distribution, \beta_2 > 3 and \gamma_2 > 0.
  • For a platykurtic distribution, \beta_2 < 3 and \gamma_2 < 0.
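
The interpretation rules above can be sketched in Python as follows. Sample central moments stand in for the theoretical ones, and the function names, tolerance, and data sets are assumptions made for this illustration.

```python
def central_moment(values, r):
    """r-th sample central moment (average of r-th powers of deviations)."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** r for x in values) / len(values)


def kurtosis_coefficients(values):
    """Return (beta_2, gamma_2) computed from sample central moments.

    Assumes the values are not all equal, otherwise mu_2 is zero.
    """
    mu2 = central_moment(values, 2)
    mu4 = central_moment(values, 4)
    beta2 = mu4 / mu2 ** 2
    return beta2, beta2 - 3


def classify(gamma2, tol=1e-9):
    """Name the curve type from gamma_2; `tol` absorbs rounding error."""
    if abs(gamma2) <= tol:
        return "mesokurtic"
    return "leptokurtic" if gamma2 > 0 else "platykurtic"


# Made-up samples chosen so that the coefficients are easy to verify.
samples = {
    "peaked": [-2, 0, 0, 0, 0, 0, 0, 2],  # beta_2 = 4
    "flat": [3, 3, 3, 3, 7, 7, 7, 7],     # beta_2 = 1
    "reference": [-1, 0, 0, 0, 0, 1],     # beta_2 = 3, yet clearly not normal
}

labels = {}
for name, sample in samples.items():
    beta2, gamma2 = kurtosis_coefficients(sample)
    labels[name] = classify(gamma2)
    print(name, labels[name])
# peaked leptokurtic
# flat platykurtic
# reference mesokurtic
```

The last sample shows that \beta_2 = 3 places a distribution in the mesokurtic class without making it a normal distribution.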

References

  1. Essentials of Data Science With R Software – 1: Probability and Statistical Inference, By Prof. Shalabh, Dept. of Mathematics and Statistics, IIT Kanpur.

CITE THIS AS:

“Probability and Statistical Inference – Skewness and Kurtosis”  From NotePub.io – Publish & Share Note! https://notepub.io/notes/mathematics/statistics/statistical-inference-for-data-science/skewness-and-kurtosis/
