In the earlier note, we studied a quantitative approach to comparing the variability of multiple variables using the coefficient of variation (CV). In this note, we compare variability using a graphical method called the boxplot, which displays the five-number summary of the data.
BoxPlot
A boxplot is a graph that summarizes the distribution of a variable using its minimum, first quartile, median, third quartile, and maximum values. It is useful for comparing different datasets.
- Minimum Value: the smallest observed value of the variable.
- Maximum Value: the largest observed value of the variable.
These are useful when we want the range, or spread, of a variable's data points. The range is computed by subtracting the minimum value from the maximum value.
The first, second, and third quartiles form the box: the lower edge of the box marks the first quartile (25%), the line inside the box marks the second quartile, and the upper edge of the box marks the third quartile (75%).
- To find the interquartile range (IQR), subtract the first quartile from the third quartile.
- The second quartile is the median and indicates the central tendency of the data.
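The quantities described above can be computed directly. The following is a minimal sketch using only Python's standard library. The `marks` list and the helper name `five_number_summary` are made up for illustration, and the "inclusive" quartile method is one of several conventions, so other tools may report slightly different quartiles for the same data.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    data = sorted(values)
    # method="inclusive" interpolates between observed values, so the
    # quartiles always lie between the minimum and the maximum.
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, q2, q3, data[-1]


# Hypothetical sample, not taken from the original note.
marks = [35, 42, 47, 50, 55, 58, 61, 66, 72]

minimum, q1, median, q3, maximum = five_number_summary(marks)
data_range = maximum - minimum  # range = maximum - minimum
iqr = q3 - q1                   # IQR = third quartile - first quartile

print("min:", minimum, "Q1:", q1, "median:", median, "Q3:", q3, "max:", maximum)
print("range:", data_range, "IQR:", iqr)
```

The range uses only the two extreme values, while the IQR describes the spread of the middle half of the data, so the IQR is less affected by a single unusually small or large observation.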
Looking at a boxplot gives a clear picture of how a variable is distributed. The five values it displays are known as the five-number summary of the data.
Comparison of variables through boxplots using Python
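As a rough sketch of such a comparison, the example below draws boxplots of two variables side by side with matplotlib. The two datasets (`class_a`, `class_b`), the axis labels, and the output file name are assumptions made for illustration; they do not come from the original note.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# Hypothetical marks for two classes (made-up numbers).
class_a = [35, 42, 47, 50, 55, 58, 61, 66, 72]
class_b = [20, 38, 45, 52, 55, 60, 70, 85, 95]

fig, ax = plt.subplots()

# Passing a list of datasets draws one box per dataset at x = 1, 2, ...
result = ax.boxplot([class_a, class_b])

ax.set_xticks([1, 2])
ax.set_xticklabels(["Class A", "Class B"])
ax.set_ylabel("Marks")
ax.set_title("Comparison of two variables with boxplots")

fig.savefig("boxplot_comparison.png")
```

In the resulting figure both boxes have their median line at the same height, but the box for Class B is taller, which indicates a larger IQR and therefore more variability in the middle half of its data. Note that by default matplotlib extends the whiskers only up to 1.5 times the IQR beyond the box and draws any points outside that span individually.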
References
- Descriptive Statistic, By Prof. Shalabh, Dept. of Mathematics and Statistics, IIT Kanpur.