Descriptive Statistics – Measures of Association – Rank Correlation

In the previous notes on descriptive statistics, we saw how to measure the association between two continuous variables using the Karl Pearson correlation coefficient and looked at the different kinds of correlation. This note shows how to measure the association between two ordinal variables (rank data) using Spearman’s rank correlation coefficient.

Understanding of Ordinal data (Rank data)

Suppose n candidates participate in a talent competition that is judged by two judges, X and Y. Each judge independently gives a score to every participant. The two scores for the first candidate are written as (x1, y1), and similarly (xi, yi) for the i-th candidate. The rank implied by x1 may or may not be the same as the rank implied by y1.

Understanding of ordinal data

Judge X ranks the n candidates by score: rank 1 goes to the worst candidate (the lowest score xi), rank 2 to the candidate with the second-lowest score, and so on up to rank n for the best candidate (the highest score xi). Judge Y ranks the n candidates as 1, 2, 3, …, n in the same way, based on the scores y1, y2, y3, …, yn.
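The ranking rule described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original notes: the function name `rank_ascending` and the sample scores are hypothetical, and the sketch assumes that all scores are distinct (no ties).

```python
def rank_ascending(scores):
    """Return ranks 1..n, where rank 1 goes to the lowest score.

    Assumption: all scores are distinct (ties are not handled here).
    """
    if len(set(scores)) != len(scores):
        raise ValueError("tied scores are not handled in this sketch")
    # Candidate indices ordered from the lowest score to the highest.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0] * len(scores)
    for position, candidate in enumerate(order, start=1):
        ranks[candidate] = position
    return ranks


# Hypothetical scores from one judge for four candidates.
scores_x = [62, 85, 47, 90]
print(rank_ascending(scores_x))  # [2, 3, 1, 4]
```

The candidate with the lowest score (47) receives rank 1 and the candidate with the highest score (90) receives rank 4, matching the convention used for judge X.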

Every participant has two ranks, one from each judge. We expect both judges to give higher ranks to good candidates and lower ranks to bad candidates. Our objective is to measure the degree of association between the two judgments, i.e., between the two sets of ranks.

Spearman’s rank correlation coefficient

To measure the degree of agreement between the ranks given by the two judges in the above example, we use Spearman’s rank correlation coefficient. Note that even though the judges give scores to the candidates, Spearman’s rank correlation coefficient works only on ranks. The scores can easily be converted into ranks, in either ascending or descending order.

  • Rank(x_i) : Rank of the i^{th} observation on X, i.e., the rank of x_i among the ordered values x_1, x_2, x_3, \cdots, x_n of X.
  • Rank(y_i) : Rank of the i^{th} observation on Y, i.e., the rank of y_i among the ordered values y_1, y_2, y_3, \cdots, y_n of Y.
  • d_i = Rank(x_i) - Rank(y_i) : Difference between the two ranks of the i^{th} candidate.

Spearman’s rank correlation coefficient (R) is defined as:

 R = 1 - \frac{ 6 \sum_{i=1}^n d_i^2 }{n(n^2-1)}     ; -1 \leq R \leq 1
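As a rough sketch, the formula above translates directly into Python. The function name `spearman_r` and the sample ranks are hypothetical, and the sketch assumes each judge's ranks are the distinct integers 1 to n (no ties), which is the setting the formula is written for.

```python
def spearman_r(rank_x, rank_y):
    """Spearman's R from two equal-length lists of ranks.

    Assumption: each list holds the distinct ranks 1..n (no ties).
    """
    n = len(rank_x)
    if n != len(rank_y):
        raise ValueError("both judges must rank the same candidates")
    if n < 2:
        raise ValueError("need at least two candidates")
    # Sum of squared rank differences d_i = Rank(x_i) - Rank(y_i).
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Hypothetical ranks for four candidates: the judges disagree only on
# the order of the first two candidates.
r = spearman_r([1, 2, 3, 4], [2, 1, 3, 4])
print(round(r, 2))  # 0.8
```

Here the sum of squared differences is 2, so R = 1 - 12/60 = 0.8.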

Interpretation of Spearman’s rank correlation coefficient

It does not matter whether the ranks are assigned in ascending or descending order, provided the same order is used for both judges. When both judges assign exactly the same ranks to all the candidates, R = +1; when they assign exactly opposite ranks to all the candidates, R = -1.
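The two extreme values can be checked directly from the definition. As a sketch, assume judge X assigns rank i to the i^{th} candidate and judge Y assigns the fully reversed rank n + 1 - i (this labeling is an assumption made only for the derivation):

```latex
d_i = i - (n + 1 - i) = 2i - (n + 1)

\sum_{i=1}^n d_i^2 = 4 \sum_{i=1}^n i^2 - 4(n+1) \sum_{i=1}^n i + n(n+1)^2 = \frac{n(n^2-1)}{3}

R = 1 - \frac{6}{n(n^2-1)} \cdot \frac{n(n^2-1)}{3} = 1 - 2 = -1
```

When the two sets of ranks are identical, every d_i = 0, so the sum of squares vanishes and R = 1 - 0 = +1.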

Example 1: Scores given by two judges to 5 candidates are as follows: 

Example 1: Spearman's rank correlation coefficient

 R = 1 - \frac{ 6 \sum_{i=1}^n d_i^2 }{n(n^2-1)}  = 1 - \frac{ 6 \sum_{i=1}^5 d_i^2 }{5(5^2-1)} = 1 - \frac{ 6 \times 34 }{120} = -0.7

Spearman’s rank correlation coefficient is -0.7, which indicates a fairly strong negative association: the two judges have ranked the candidates in largely opposite orders.
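The steps of such a calculation can be laid out as a small table of rank differences. The sketch below uses hypothetical ranks for 5 candidates; they are not the data from Example 1, and were chosen only because they also give R = -0.7.

```python
# Hypothetical ranks for 5 candidates (NOT the original Example 1 data).
rank_x = [1, 2, 3, 4, 5]
rank_y = [5, 4, 2, 1, 3]

n = len(rank_x)
d = [rx - ry for rx, ry in zip(rank_x, rank_y)]   # rank differences d_i
d_squared = [di ** 2 for di in d]                 # squared differences d_i^2
sum_d2 = sum(d_squared)
r = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

print("candidate  rank_x  rank_y   d  d^2")
for i in range(n):
    print(f"{i + 1:9d}  {rank_x[i]:6d}  {rank_y[i]:6d}  {d[i]:2d}  {d_squared[i]:3d}")
print("sum of d^2 =", sum_d2)  # sum of d^2 = 34
print(f"R = {r:.1f}")          # R = -0.7
```

The candidates that judge X ranks lowest are mostly ranked highest by judge Y, which is what pushes R toward -1.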

References

  1. Descriptive Statistic, By Prof. Shalabh, Dept. of Mathematics and Statistics, IIT Kanpur.
