Descriptive Statistics – Measures of association – Quantile-Quantile Plots

In the previous note on descriptive statistics, we introduced bivariate scatter plots and learned how to use them to analyze the relationship between two variables. This note covers another graphical tool, the Quantile-Quantile (Q-Q) plot, which looks at the association between two variables in a different way: by comparing their distributions.

Need for Quantile-Quantile (Q-Q) plots

Suppose we receive two samples but do not know whether they were drawn from the same population or from different populations. Our task is to figure out which is the case. Situations like this arise often, and answering the question is useful for many kinds of statistical inference.

Example 1:

In hypothesis testing, we perform one-sample tests, two-sample tests, and so on. These tests require that the samples come from a particular population. Based on the assumption made about the population, a specific probability density function is used. For example, if the samples are assumed to be drawn from a Gaussian distribution, the known Gaussian density function is used directly to compute the required parameters.

Example 2:

Suppose X_1, X_2, X_3, ..., X_n is a random sample from a Gaussian (also called Normal) distribution with mean \mu and variance \sigma^2. Here the population is very large and practically unknown to us, and it is assumed to be characterized by the normal density function. If our goal is to check whether the sample really comes from a normal population, Q-Q plots are helpful.

This check matters because most statistical tools used in hypothesis testing, such as the z-test and the t-test, assume by default that the population follows a normal distribution. Until this assumption is verified, any further statistical inference is questionable.
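
As a rough illustration of how such a normality check could be set up, the sketch below pairs the ordered sample values with the quantiles of a normal distribution fitted to the same sample. It uses only the Python standard library. The function name `normal_qq_pairs`, the simulated data, and the plotting positions (i - 0.5) / n are assumptions made for this example, not part of the original note.

```python
import random
from statistics import NormalDist, mean, stdev

def normal_qq_pairs(sample):
    """Return (theoretical, observed) quantile pairs for a normal Q-Q plot."""
    n = len(sample)
    ordered = sorted(sample)
    # Normal distribution with the sample's own mean and standard deviation.
    fitted = NormalDist(mean(sample), stdev(sample))
    # Plotting positions (i - 0.5) / n: one common convention, assumed here.
    probs = [(i - 0.5) / n for i in range(1, n + 1)]
    theoretical = [fitted.inv_cdf(p) for p in probs]
    return list(zip(theoretical, ordered))

# Simulated data (an assumption): 500 draws from a normal population.
rng = random.Random(0)
sample = [rng.gauss(10.0, 2.0) for _ in range(500)]

pairs = normal_qq_pairs(sample)
# Average vertical distance of the points from the 45-degree line.
mean_gap = sum(abs(t - o) for t, o in pairs) / len(pairs)
print(f"mean distance from the 45-degree line: {mean_gap:.3f}")
```

The pairs can be passed to any plotting library as x and y coordinates. If the points stay close to the 45-degree line, the normality assumption looks plausible for that sample.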

Example 3:

More generally, we may want to find out whether the given sample data follow a particular distribution.

Quantile-Quantile Plots

A Q-Q plot summarizes whether the distributions of two variables are similar, for example with respect to location. To make the comparison, we plot the quantiles of one variable against the corresponding quantiles of the other. The resulting graph is the quantile-quantile plot.

In simple words, if the quantiles (such as the 25%, 50%, 75%, and 100% quantiles) of the two variables are similar, the plotted points fall on or close to a straight line.
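
The construction described above can be sketched in a few lines of Python. This is a minimal example under stated assumptions: the helper name `qq_pairs`, the simulated samples, and the choice of the "inclusive" quantile method are all illustrative. `statistics.quantiles` returns n - 1 cut points, so n=4 gives the 25%, 50%, and 75% quantiles.

```python
import random
from statistics import quantiles

def qq_pairs(x, y, n=20):
    """Pair up matching quantiles of two samples (n - 1 cut points each)."""
    qx = quantiles(x, n=n, method="inclusive")
    qy = quantiles(y, n=n, method="inclusive")
    return list(zip(qx, qy))

# Two simulated samples (an assumption); their sizes do not need to match.
rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(1000)]
y = [rng.gauss(0, 1) for _ in range(800)]

# Each pair is one point of the Q-Q plot: (x quantile, y quantile).
for qx, qy in qq_pairs(x, y, n=4):
    print(f"x quantile = {qx:6.3f}   y quantile = {qy:6.3f}")
```

Working with quantiles rather than raw observations is what lets the two samples have different sizes.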

Q-Q plot Interpretations

QQ plot - Samples having similar distributions

If all the quantile points lie on or close to a straight line at an angle of 45 degrees to the x-axis, the two samples have similar distributions.

QQ plot - Samples do not have the same distribution

The y quantiles are lower than the corresponding x quantiles, so the points fall below the 45-degree line. This shows that y values tend to be lower than x values. Hence the distributions are not the same.

QQ plot - Samples do not have the same distribution

The x quantiles are lower than the corresponding y quantiles, so the points lie above the 45-degree line. This indicates that x values tend to be lower than y values. Hence the distributions are not the same.

QQ plot - Samples do not have the same distribution

The plot shows a break point: up to that point the y quantiles are lower than the x quantiles, and beyond it the y quantiles are higher than the x quantiles. Hence the distributions are not the same.
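
The four patterns above can be told apart by looking at the sign of (y quantile - x quantile) across the plot. The sketch below is only a rough illustration: the function name `describe_qq`, the tolerance `tol`, and the simulated samples are hypothetical choices, and in practice the plot itself should be inspected rather than relying on a fixed threshold.

```python
import random
from statistics import quantiles

def describe_qq(x, y, n=20, tol=0.3):
    """Roughly label a Q-Q pattern by where the points fall relative to y = x.

    `tol` is an arbitrary tolerance in data units (an assumption).
    """
    qx = quantiles(x, n=n, method="inclusive")
    qy = quantiles(y, n=n, method="inclusive")
    diffs = [b - a for a, b in zip(qx, qy)]
    if all(abs(d) <= tol for d in diffs):
        return "close to the 45-degree line"
    if all(d < 0 for d in diffs):
        return "y quantiles below x quantiles"
    if all(d > 0 for d in diffs):
        return "y quantiles above x quantiles"
    return "points cross the 45-degree line"

# Simulated samples (assumptions) that mimic the four cases.
rng = random.Random(2)
x = [rng.gauss(0, 1) for _ in range(5000)]
y_same = [rng.gauss(0, 1) for _ in range(5000)]   # same distribution as x
y_low = [rng.gauss(-2, 1) for _ in range(5000)]   # shifted down
y_high = [rng.gauss(2, 1) for _ in range(5000)]   # shifted up
y_wide = [rng.gauss(0, 3) for _ in range(5000)]   # same centre, larger spread

for name, y in [("same", y_same), ("low", y_low),
                ("high", y_high), ("wide", y_wide)]:
    print(f"{name:>4}: {describe_qq(x, y)}")
```

In the last case the two samples share a centre but differ in spread, so the y quantiles are below the x quantiles in the lower half and above them in the upper half. This produces the break point described above.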

References

  1. Descriptive Statistic, By Prof. Shalabh, Dept. of Mathematics and Statistics, IIT Kanpur.
