Reputation: 286
I have the following list of tuples:
[(1, 6), (2, 3), (2, 5), (2, 2), (1, 7), (3, 2), (2, 2)]
I would like to rank this list by the first value in the tuple and resolve ties by the second value, so that the output looks like this:
[1, 5, 6, 3, 2, 7, 3]
I couldn't think of a simple way of doing this, so I was looking for something like the scipy.stats.rankdata function. However, for my use-case it's missing something like the order argument in numpy.argsort. I feel like I'm missing something obvious here, in which case I apologise for not googling my answer better!
EDIT:
To explain better what I am trying to achieve:
Given a list of tuples
>>> l = [(1, 6), (2, 3), (2, 5), (2, 2), (1, 7), (3, 2), (2, 2)]
I want to create a list containing the rank of the elements of the list l. For example, ranking by the first value in each tuple:
>>> from scipy import stats
>>> stats.rankdata([i for i, j in l], method='min')
array([ 1., 3., 3., 3., 1., 7., 3.])
This is almost what I want, but the result contains ties: the rank 1. appears twice and the rank 3. appears four times.
I would like to break the ties using the second value in each tuple, so that for example the two tuples (2, 2) will have the same rank, but the (2, 3) and (2, 5) will have a different rank. The resulting list should look like this:
array([ 1., 5., 6., 3., 2., 7., 3.])
Upvotes: 3
Views: 1105
Reputation: 286
Thanks to Ignacio Vazquez-Abrams' answer I managed to find a solution! It's perhaps not the most efficient way to do this, but it works.
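>>> from scipy import stats
>>> l = [(1, 6), (2, 3), (2, 5), (2, 2), (1, 7), (3, 2), (2, 2)]
>>> # Sort the distinct tuples; tuples compare element by element.
>>> s = sorted(set(l))
>>> # Replace each tuple by its position among the distinct sorted tuples.
>>> r = [s.index(i) for i in l]
>>> # Rank those positions; method='min' gives tied tuples the lowest rank.
>>> rank = stats.rankdata(r, method='min')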
>>> rank
array([ 1., 5., 6., 3., 2., 7., 3.])
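The `s.index(i)` lookup above rescans the sorted list for every element, so it gets slow on long lists. Here is a minimal sketch of the same idea in plain Python, assuming the elements are mutually comparable. The helper name `rank_min` is made up for illustration. It sorts the full list once and uses `bisect_left` to count how many elements are strictly smaller than each one.

```python
from bisect import bisect_left


def rank_min(items):
    """Hypothetical helper: 1-based ranks where equal items share the lowest rank."""
    ordered = sorted(items)  # tuples compare element by element
    # bisect_left returns the first position of each item in the sorted list,
    # which equals the number of strictly smaller items.
    return [bisect_left(ordered, item) + 1 for item in items]


l = [(1, 6), (2, 3), (2, 5), (2, 2), (1, 7), (3, 2), (2, 2)]
print(rank_min(l))  # [1, 5, 6, 3, 2, 7, 3]
```

This returns a plain list of integers rather than a NumPy array of floats, and it only reproduces the behaviour of `method='min'`.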
Upvotes: 1
Reputation: 798884
Python compares tuples element by element, so a plain sort already orders by the first value and breaks ties with the second.
>>> import operator
>>> [x for x, y in sorted(enumerate([(1, 6), (2, 3), (2, 5), (2, 2), (1, 7), (3, 2), (2, 2)], start=1), key=operator.itemgetter(1))]
[1, 5, 4, 7, 2, 3, 6]
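Note that the list above is the sort order (the 1-based original positions, listed in sorted order), not the ranks themselves. Here is a rough sketch of one way to turn that order into ranks where equal tuples share the lowest rank. The function name `ranks_from_sort_order` is invented for this example.

```python
import operator


def ranks_from_sort_order(items):
    """Hypothetical helper: 1-based ranks where equal items share the lowest rank."""
    # Pair each item with its 1-based original position, then sort by the item.
    order = sorted(enumerate(items, start=1), key=operator.itemgetter(1))
    ranks = [0] * len(items)
    previous = None
    rank = 0
    for position, (index, item) in enumerate(order, start=1):
        # A new rank starts only when the item differs from the previous one.
        if position == 1 or item != previous:
            rank = position
        ranks[index - 1] = rank
        previous = item
    return ranks


l = [(1, 6), (2, 3), (2, 5), (2, 2), (1, 7), (3, 2), (2, 2)]
print(ranks_from_sort_order(l))  # [1, 5, 6, 3, 2, 7, 3]
```

For the example list this gives the ranking asked for in the question, with the two `(2, 2)` tuples sharing rank 3.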
Upvotes: 4