In the earlier note, we covered what variability is and how to measure it using specific values or partitioning values. In this note, we start measuring variation, or dispersion, based on deviations.
Deviation based measures of variation
We need a tool that can measure the deviation of every observation around any given value. However, we mainly consider the mean and measure around the mean value. This approach is called deviation-based measures of variation.
Suppose the deviation of any observation $x_i$ from any value $A$ is measured as $d_i = x_i - A$. It gives the difference between $x_i$ and $A$, and its sign falls into one of three cases:
- if $x_i > A$, then such deviations are positive.
- if $x_i < A$, then such deviations are negative.
- if $x_i = A$, then such deviations are zero.
We can't draw a conclusion by looking at all these $d_i$ values individually. A better way is to summarize them in a single quantity. If we consider the average of these deviations $d_i$, then the average value is:
Deviation = $\frac{1}{n}\sum_{i=1}^{n} d_i = \frac{1}{n}\sum_{i=1}^{n} (x_i - A)$
where $d_1, d_2, \ldots, d_n$ are the differences between the respective values $x_1, x_2, \ldots, x_n$ and $A$.
The resulting average may be close to zero and suggest no variation or only slight variation, which may be incorrect. So we need to consider only the magnitudes of the deviations and drop the signs.
The main reason to keep the magnitude and drop the direction is that, during averaging, deviations of opposite sign cancel out and, as a consequence, produce a misleading result.
There are two ways to achieve this: either we take the absolute value of each deviation, or we square each deviation. Based on these two approaches, we have two types of measures. The first is Absolute Deviation, and the second is Variance.
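The cancellation can be checked numerically. The following is a minimal sketch on made-up values (they do not come from any dataset used in this note), taking the mean as the reference value A:

```python
# Illustrative values only; not taken from any dataset in this note.
observations = [2, 4, 6, 8, 10]
n = len(observations)
mean = sum(observations) / n

# Signed deviations around the mean: positives and negatives offset each other.
signed = [x - mean for x in observations]
mean_signed = sum(signed) / n

# Dropping the sign, either by absolute value or by squaring, keeps the spread visible.
mean_absolute = sum(abs(d) for d in signed) / n
mean_squared = sum(d ** 2 for d in signed) / n

print(mean_signed, mean_absolute, mean_squared)
```

The signed average comes out as zero even though the observations are clearly spread out, while the absolute and squared versions are both positive.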
Absolute Deviation
A dataset may contain discrete (ungrouped) or continuous (grouped) variables. For a discrete variable, we use the observations as they are. For a continuous variable, we group the observations into class intervals, convert the data into a frequency table, and use the mid-values of the class intervals together with the corresponding frequencies to construct the statistical measures.
Absolute Deviation for Discrete Data
Suppose we have n observations, $x_1, x_2, \ldots, x_n$, on a variable X. To calculate the absolute deviation of all the observations, we need a reference value A, which can be any of the observed values or a derived value. In general, the absolute deviation of each observation is:
$|x_i - A|$, for all n observations ($i = 1, 2, \ldots, n$)
Here we ignore negative signs and take only the positive value of each difference. For example, the absolute values |5 - 10| and |10 - 5| are both 5. In the end, we sum all the absolute values and divide by the number of observations. It is represented as follows:
Absolute deviation = $\frac{1}{n}\sum_{i=1}^{n} |x_i - A|$
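The formula for ungrouped data can be sketched in a few lines of Python. The function name `absolute_deviation`, the sample values, and the choice A = 10 are assumptions made only for illustration:

```python
def absolute_deviation(values, a):
    """Average of |x_i - a| over all observations (ungrouped data)."""
    if not values:
        raise ValueError("values must not be empty")
    return sum(abs(x - a) for x in values) / len(values)


# Hypothetical observations and an arbitrary reference value A = 10.
sample = [3, 7, 8, 12, 15]
print(absolute_deviation(sample, 10))
```

Any reference value can be passed as `a`, which matches the idea that A may be an observed value or a derived one.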
Absolute Deviation for Continuous Data
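Suppose we have n observations on a variable X, grouped into k class intervals in a frequency table. Writing $l_i$ and $u_i$ for the lower and upper limits of the i-th class interval, the mid-value of each interval is obtained as follows:
$x_i = \frac{l_i + u_i}{2}$, where $l_i < u_i$ and $i = 1, 2, \ldots, k$
The associated absolute frequency is $f_i$ for the i-th class interval. The frequency $f_i$ represents the number of observations that belong to the i-th class interval. The sum of all the absolute frequencies must be $n = \sum_{i=1}^{k} f_i$.
Absolute deviation = $\frac{1}{n}\sum_{i=1}^{k} f_i \, |x_i - A|$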
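For grouped data, each class interval is represented by its mid-value and weighted by its frequency. The sketch below assumes the intervals are given as (lower, upper) pairs. The function name, the intervals, and the frequencies are hypothetical.

```python
def grouped_absolute_deviation(intervals, frequencies, a):
    """Frequency-weighted average of |midpoint - a| for grouped data."""
    if len(intervals) != len(frequencies):
        raise ValueError("each class interval needs exactly one frequency")
    n = sum(frequencies)  # total number of observations
    if n == 0:
        raise ValueError("frequencies must sum to a positive number")
    # Mid-value of each class interval: (lower + upper) / 2.
    midpoints = [(lower + upper) / 2 for lower, upper in intervals]
    return sum(f * abs(m - a) for m, f in zip(midpoints, frequencies)) / n


# Hypothetical frequency table with three class intervals.
intervals = [(0, 10), (10, 20), (20, 30)]
frequencies = [2, 5, 3]
print(grouped_absolute_deviation(intervals, frequencies, 15))
```

Because the individual observations are replaced by mid-values, the result is an approximation of the absolute deviation of the raw data.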
Average Absolute Deviation
In the absolute deviation section, we saw that the absolute deviation can be calculated from any value A. When A is set to one of the measures of central tendency, such as the mean, median, or mode, the result is called the mean absolute deviation, median absolute deviation, or mode absolute deviation, respectively.
The absolute deviation about the median is always less than or equal to the absolute deviation about the mean, or about any other value A, because the median minimizes the sum of absolute deviations.
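The comparison between the deviation about the median and the deviation about other reference values can be checked on a small example. This is only a sketch: the data are made up and deliberately skewed, and the helper name `absolute_deviation` is an assumption.

```python
import statistics


def absolute_deviation(values, a):
    """Average of |x_i - a| over all observations."""
    return sum(abs(x - a) for x in values) / len(values)


# Hypothetical data with one extreme value, so the mean and median differ a lot.
data = [1, 2, 3, 4, 100]
mean = statistics.mean(data)
median = statistics.median(data)

about_mean = absolute_deviation(data, mean)
about_median = absolute_deviation(data, median)
print(about_mean, about_median)

# Try a range of other reference values A; none should beat the median.
others = [absolute_deviation(data, a) for a in range(0, 101)]
print(min(others) >= about_median)
```

On this sample the deviation about the median is the smaller of the two, and no other candidate value of A in the tested range gives a smaller result.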
In the note on measures of central tendency, we discussed how to compute the mean, median, and mode for both discrete and continuous datasets. The same formulas must be used here; otherwise, the whole calculation will produce a wrong result.
To compute average absolute deviation, we need continuous or discrete values. Instead of creating dummy data, we will use the tips dataset and calculate the average absolute deviation using the Python programming language. Also, we will analyze the average absolute deviation by considering the measures of central tendency using mean, median, and mode on the same variable.
Mean Absolute Deviation
Suppose we have n observations, $x_1, x_2, \ldots, x_n$, on a discrete or ungrouped variable X. The computation of the mean absolute deviation is as follows:
- $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ is the sample mean. From the sample mean, we find the absolute deviation of each observation, $|x_i - \bar{x}|$.
- Mean Absolute Deviation = $\frac{1}{n}\sum_{i=1}^{n} |x_i - \bar{x}|$
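The two steps above can be sketched with Python's standard `statistics` module. The function name and the values are hypothetical stand-ins for a numeric column. They are not taken from the tips dataset.

```python
import statistics


def mean_absolute_deviation(values):
    """Average absolute deviation of the observations about their mean."""
    x_bar = statistics.fmean(values)                 # step 1: sample mean
    deviations = [abs(x - x_bar) for x in values]    # |x_i - x_bar| for each observation
    return statistics.fmean(deviations)              # step 2: average of the deviations


# Hypothetical observations.
bills = [10.0, 12.0, 15.0, 18.0, 25.0]
print(mean_absolute_deviation(bills))
```

The same function should work on any list of numbers, for example a column of a data frame converted to a list.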
Median Absolute Deviation
Suppose we have n observations, $x_1, x_2, \ldots, x_n$, on a discrete or ungrouped variable X. The computation of the median absolute deviation is as follows:
- Let $\tilde{x}$ be the median of the observations. To see how to compute the median, refer to the note on measures of central tendency using the median.
- Next, we find the absolute deviation of every observation from the median value: $|x_i - \tilde{x}|$ for all the observations. Once we have all the deviation values, we again take the median of the computed values.
- Median Absolute Deviation = Median($|x_i - \tilde{x}|$), where i = 1 to n indexes the deviation values from the median.
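The median-of-deviations procedure can be sketched as follows, again with the standard `statistics` module. The function name and sample values are assumptions chosen for illustration.

```python
import statistics


def median_absolute_deviation(values):
    """Median of the absolute deviations of the observations from their median."""
    med = statistics.median(values)                # median of the observations
    deviations = [abs(x - med) for x in values]    # absolute deviation from the median
    return statistics.median(deviations)           # median of those deviations


# Hypothetical data with one extreme value.
sample = [1, 2, 3, 4, 100]
print(median_absolute_deviation(sample))
```

Because both steps use the median rather than the mean, the single extreme value in the sample has very little influence on the result.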
References
- Descriptive Statistic, By Prof. Shalabh, Dept. of Mathematics and Statistics, IIT Kanpur.
- https://en.wikipedia.org/wiki/Average_absolute_deviation