How to compare ranked lists

Valerio Schiavoni picture Valerio Schiavoni · Nov 26, 2012 · Viewed 13.6k times · Source

I have two lists of ranked items. Each item has an rank and an associated score. The score has decided the rank. The two lists can contains (and usually do) different items, that is their intersection can be empty. I need measures to compare such rankings. Are there well-known algorithms (in literature or real-world systems) to do so ? The measure of distance should take into account the scores as well as the ranks of the items.

Answer

Mike Dooley picture Mike Dooley · Mar 24, 2016

This question has never been answered before, but I still think it's important to a lot of people out there:

Your two requirements, i.e. non-conjointness of lists and importance of ranks are not met by common correlation tests. In addition to that most of them (Kendall-Tau for example) do not take the order into account:

>>> from scipy.stats import kendalltau
>>> kendalltau([1,2,3,4,5], [2,1,3,4,5])
KendalltauResult(correlation=0.79999999999999982, value=0.050043527347496564)
>>> kendalltau([1,2,3,4,5], [1,2,3,5,4])
KendalltauResult(correlation=0.79999999999999982, value=0.050043527347496564)

The 1st comparison should yield a significantly smaller value than the 2nd one, because the head of the list is more important than the tail (2nd requirement).

In addition to that one can see that both lists need to be the same size and have the same kind of elements (1st requirement)

Possible solution:

The measure that satisfies all your needs is called Rank Biased Overlap. It's a generalization of the so called average based overlap, which is wonderfully illustrated in this blog. The same guy also put out an implementation of RBO.

Update Jan 2018:

  • Another implementation of RBO for python 3.5.2