Descriptive Statistics – Raw and Central Moments

The earlier note on descriptive statistics introduced moments and their significance. This note covers moments about an arbitrary point and then the two special cases used to summarize data: raw moments and central moments.

Moments about Arbitrary Point A

Data are usually divided into two categories, discrete and continuous, and the two are handled differently. This section covers both cases. The r^{th} moment of a variable X about an arbitrary point A is obtained as follows:

For ungrouped or discrete data

Suppose we have observations x_1, x_2, x_3, ... , x_n on a variable X. The r^{th} moment about A is defined as:

\mu{^\prime}_{r} = \frac {1}{n} \sum_{i=1}^n { {(x_i - A)}^r }
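The ungrouped formula translates directly into code. The following is a minimal Python sketch; the function name moment_about and the sample values are illustrative assumptions and do not come from the original note.

```python
def moment_about(data, r, a=0.0):
    """Return the r-th moment of ungrouped observations about the point a."""
    n = len(data)
    if n == 0:
        raise ValueError("data must contain at least one observation")
    # (1/n) * sum of (x_i - a)^r over all observations
    return sum((x - a) ** r for x in data) / n


# Made-up sample, used only for illustration.
data = [2, 4, 4, 6, 9]

print(moment_about(data, 1, a=4))  # first moment about A = 4
print(moment_about(data, 2, a=4))  # second moment about A = 4
```

Setting a to 0 gives moments about the origin, and setting it to the mean of the data gives moments about the mean.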

For grouped or continuous data

Suppose the observations on a variable X are summarized in a frequency table with k class intervals e_1 - e_2, e_2 - e_3, ..., e_k - e_{k+1}. The midpoint of each interval is:

 x_i = \frac{e_i + e_{i+1}}{2} , where i = 1, 2, ..., k

The absolute frequency associated with the class interval e_i - e_{i+1} is f_i, the number of observations that fall in that interval. The absolute frequencies sum to the total number of observations, n = \sum_{i=1}^k {f_i}. The r^{th} moment about A is then:

\mu{^\prime}_{r} = \frac {1}{n} \sum_{i=1}^k {f_i \times {(x_i - A)}^r }
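For grouped data, the same calculation uses class midpoints weighted by their frequencies. The sketch below assumes the frequency table is given as a list of class boundaries (edges) and a list of frequencies (freqs) with one fewer entry; these names, the function name, and the sample table are illustrative assumptions.

```python
def grouped_moment_about(edges, freqs, r, a=0.0):
    """Return the r-th moment about a for data grouped into class intervals.

    edges holds the class boundaries, so it has one more entry than freqs.
    """
    if len(edges) != len(freqs) + 1:
        raise ValueError("edges must have exactly one more entry than freqs")
    # Midpoint of each class interval.
    mids = [(lo + hi) / 2 for lo, hi in zip(edges, edges[1:])]
    n = sum(freqs)  # total number of observations
    # (1/n) * sum of f_i * (x_i - a)^r over all classes
    return sum(f * (x - a) ** r for f, x in zip(freqs, mids)) / n


# Made-up frequency table: classes 0-10, 10-20, 20-30.
edges = [0, 10, 20, 30]
freqs = [2, 5, 3]

print(grouped_moment_about(edges, freqs, 1))        # first moment about the origin
print(grouped_moment_about(edges, freqs, 2, a=15))  # second moment about A = 15
```

Because every observation in a class is represented by the class midpoint, the result approximates the moment of the underlying raw data.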

Raw Moments

The r^{th} moment about the origin (A = 0) is called the raw moment and is defined as follows:

Equations for raw moment for grouped and ungrouped data

For ungrouped data it is represented as follows:

\mu{^\prime}_{r} = \frac {1}{n} \sum_{i=1}^n { {(x_i)}^r }

For grouped data it is represented as follows:

\mu{^\prime}_{r} = \frac {1}{n} \sum_{i=1}^k {f_i \times {(x_i )}^r }

Raw Moments – first and second raw moments

The first raw moment is the arithmetic mean. It is obtained by setting r = 1; for discrete data:

\mu{^\prime}_{1} = \frac {1}{n} \sum_{i=1}^n {{x_i}}

This is simply the arithmetic mean: the sum of all the data points divided by the number of data points. Similarly, the second raw moment is obtained by setting r = 2:

\mu{^\prime}_{2} = \frac {1}{n} \sum_{i=1}^n {{x_i}^2 }

The variance is Var(x) = \frac {1}{n} \sum_{i=1}^n { {(x_i - \bar{x})}^2 }, where \bar{x} = \frac {1}{n} \sum_{i=1}^n {x_i}. Expanding the square gives the following expression:

\frac {1}{n} \sum_{i=1}^n {x_i}^2 - \bar{x}^2

By substituting the first raw moment and second raw moment in the above expression, the final equation of variance looks as follows:

Var(x) = \mu{^\prime}_{2} - \mu{^\prime}_{1}^2

Thus, the first and second raw moments together determine the variance of the variable, for both discrete and continuous data.
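A quick numerical check of this identity, with the variance computed as the second raw moment minus the square of the first raw moment, is sketched below. The helper names and the sample values are illustrative assumptions.

```python
import math


def raw_moment(data, r):
    """Return the r-th raw moment (moment about the origin) of ungrouped data."""
    return sum(x ** r for x in data) / len(data)


def variance_direct(data):
    """Return the variance from its definition, dividing by n."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** 2 for x in data) / len(data)


# Made-up sample, used only for illustration.
data = [2, 4, 4, 6, 9]

# Second raw moment minus the square of the first raw moment.
var_from_raw = raw_moment(data, 2) - raw_moment(data, 1) ** 2

print(var_from_raw, variance_direct(data))
# The two values agree up to floating-point rounding.
print(math.isclose(var_from_raw, variance_direct(data)))
```

The raw-moment form needs only a single pass over the data, but it subtracts two potentially large numbers, so the definition-based form is usually the numerically safer choice.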

Central Moments

The moments of a variable X about the arithmetic mean \bar{x} are called central moments. The r^{th} central moment based on observations x_1, x_2, x_3, ... , x_n is defined as follows:

Equations for central moment for grouped and ungrouped data

For ungrouped data it is represented as follows:

\mu_{r} = \frac {1}{n} \sum_{i=1}^n { {(x_i - \bar{x})}^r }

For grouped data, the inputs are class intervals rather than individual observations:

Suppose the observations on a variable X are summarized in a frequency table with k class intervals e_1 - e_2, e_2 - e_3, ..., e_k - e_{k+1}. The midpoint of each interval is:

 x_i = \frac{e_i + e_{i+1}}{2} , where i = 1, 2, ..., k

The absolute frequency associated with the class interval e_i - e_{i+1} is f_i, the number of observations that fall in that interval. The absolute frequencies sum to n = \sum_{i=1}^k {f_i}, and the mean is \bar{x} = \frac{1}{n} \sum_{i=1}^k {f_i \times x_i}. The r^{th} central moment is then:

\mu_{r} = \frac {1}{n} \sum_{i=1}^k {f_i \times {(x_i  - \bar{x})}^r } 
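The grouped central moment can be sketched in Python as follows. The representation of the frequency table (a list of class boundaries plus a list of frequencies), the function name, and the sample table are illustrative assumptions.

```python
def grouped_central_moment(edges, freqs, r):
    """Return the r-th central moment for data grouped into class intervals.

    edges holds the class boundaries, so it has one more entry than freqs.
    """
    if len(edges) != len(freqs) + 1:
        raise ValueError("edges must have exactly one more entry than freqs")
    # Midpoint of each class interval.
    mids = [(lo + hi) / 2 for lo, hi in zip(edges, edges[1:])]
    n = sum(freqs)
    # Grouped mean: frequency-weighted average of the midpoints.
    mean = sum(f * x for f, x in zip(freqs, mids)) / n
    return sum(f * (x - mean) ** r for f, x in zip(freqs, mids)) / n


# Made-up frequency table: classes 0-10, 10-20, 20-30.
edges = [0, 10, 20, 30]
freqs = [2, 5, 3]

print(grouped_central_moment(edges, freqs, 1))  # first central moment
print(grouped_central_moment(edges, freqs, 2))  # second central moment
```

For this table the midpoints are 5, 15 and 25 and the grouped mean is 16, so the first central moment comes out as zero and the second as 49.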

Central Moments – first and second central moments

The first and second central moments are obtained by substituting r = 1 and r = 2 respectively. They are worked out below for discrete (ungrouped) data; the grouped case is analogous, with each term weighted by f_i.

For discrete or ungrouped data, the first central moment is always zero. It is illustrated as follows:

\mu_{1} = \frac {1}{n} \sum_{i=1}^n { {(x_i - \bar{x})}^1 }

= \frac {1}{n} \sum_{i=1}^n {x_i} - \frac {1}{n} \sum_{i=1}^n {\bar{x}}

= \bar{x} - \frac {n}{n} \times \bar{x}

= 0

The second central moment for ungrouped data is equal to the variance. It is obtained by setting r = 2 in the general equation of the central moment for ungrouped data:

\mu_{2} = \frac {1}{n} \sum_{i=1}^n { {(x_i - \bar{x})}^2 }

= \frac {1}{n} \sum_{i=1}^n {x_i}^2 - \bar{x}^2

= \mu{^\prime}_{2} - \mu{^\prime}_{1}^2

Thus, \mu_{2} = \mu{^\prime}_{2} - \mu{^\prime}_{1}^2, which relates the second central moment to the first and second raw moments.

In conclusion, the first central moment is always zero, and the second central moment is the variance, which measures the variability of the data.
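Both conclusions can be checked numerically. The sketch below is a minimal Python illustration; the function name and the sample values are assumptions, and statistics.pvariance from the standard library is used only as an independent reference for the population variance.

```python
import math
import statistics


def central_moment(data, r):
    """Return the r-th central moment (moment about the mean) of ungrouped data."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** r for x in data) / len(data)


# Made-up sample, used only for illustration.
data = [2, 4, 4, 6, 9]

first = central_moment(data, 1)
second = central_moment(data, 2)

print(first)                                   # zero, up to rounding
print(second, statistics.pvariance(data))      # the two values should agree
print(math.isclose(second, statistics.pvariance(data)))
```

For data whose mean is not exactly representable in floating point, the first central moment may come out as a tiny non-zero value rather than exactly zero.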

Relationship between Central and Raw Moments

  • When r = 0, raw and central moments
    • \mu_{0} = \mu{^\prime}_{0} = 1
  • When r = 1, raw and central moments
    • \mu_{1} = 0
  • When r = 2, raw and central moments
    • \mu_{2} = \mu{^\prime}_{2} - \mu{^\prime}_{1}^2
  • When r = 3, raw and central moments
    • \mu_{3} = \mu{^\prime}_{3} - 3 \mu{^\prime}_{1}  \mu{^\prime}_{2} + 2  \mu{^\prime}_{1}^3
  • When r = 4, raw and central moments
    • \mu_{4} = \mu{^\prime}_{4} - 4 \mu{^\prime}_{3} \mu{^\prime}_{1} + 6 \mu{^\prime}_{2} \mu{^\prime}_{1}^2 - 3 \mu{^\prime}_{1}^4
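The relations for r = 2, 3 and 4 can be verified numerically by computing the central moments twice: once from their definition and once from the raw moments. In the sketch below the r = 2 case is taken as the second raw moment minus the square of the first; the helper names and the sample values are illustrative assumptions.

```python
import math


def raw_moment(data, r):
    """Return the r-th raw moment (moment about the origin)."""
    return sum(x ** r for x in data) / len(data)


def central_moment(data, r):
    """Return the r-th central moment (moment about the mean)."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** r for x in data) / len(data)


# Made-up sample, used only for illustration.
data = [2, 4, 4, 6, 9]

# Raw moments of order 1 to 4.
m1, m2, m3, m4 = (raw_moment(data, r) for r in (1, 2, 3, 4))

# Central moments expressed through raw moments.
mu2 = m2 - m1 ** 2
mu3 = m3 - 3 * m1 * m2 + 2 * m1 ** 3
mu4 = m4 - 4 * m3 * m1 + 6 * m2 * m1 ** 2 - 3 * m1 ** 4

for r, from_raw in ((2, mu2), (3, mu3), (4, mu4)):
    direct = central_moment(data, r)
    print(r, from_raw, direct, math.isclose(from_raw, direct))
```

Each line of output shows the order r, the value obtained from the raw moments, the value obtained from the definition, and whether the two agree up to floating-point rounding.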

References

  1. Descriptive Statistic, By Prof. Shalabh, Dept. of Mathematics and Statistics, IIT Kanpur.
