RedForty
RedForty

Reputation: 93

Rank a list of numbers, allowing for ties

Lets say I have a list like this:

newIndexVertList = [0, 1, 2, 2, 1, 20, 21, 21, 20, 3, 23, 22]

I want to convert it to:

newIndexVertList = [0, 1, 2, 2, 1, 4, 5, 5, 4, 3, 7, 6]

Here, the transformation is based on the position of number in ascending order in the original list. Hence, in the new list numbers are replaced based on the logic:

0   -->  0    0th position in sorted list
1   -->  1    1st position in sorted list
2   -->  2    2nd position in sorted list
3   -->  3    3rd position in sorted list
20  -->  4    4th position in sorted list
21  -->  5    5th position in sorted list
22  -->  6    6th position in sorted list
23  -->  7    7th position in sorted list

Below is my code to achieve this:

c = 0
for i in xrange(len(newIndexVertList)):
    if c < newIndexVertList[i]:
        newIndexVertList[i] = c
        c += 1
        continue
    elif c == newIndexVertList[i]:
        c += 1
        continue
    else:
        continue

# actual output:   [0, 1, 2, 2, 1, 3, 4, 5, 6, 3, 7, 8]
# expected output: [0, 1, 2, 2, 1, 4, 5, 5, 4, 3, 7, 6]

What is the issue with my code? What is the elegant way to achieve this?

Since my vertex lists will be in the 100k range, I'm looking for the quickest execution.

Upvotes: 2

Views: 2432

Answers (2)

wim
wim

Reputation: 362954

You've mentioned in the comments your data will be large (100k) and you're looking for the quickest execution. You should consider using numpy:

>>> vertices = [0, 1, 2, 2, 1, 20, 21, 21, 20, 3, 23, 22]
>>> np.unique(vertices, return_inverse=True)[1]
array([0, 1, 2, 2, 1, 4, 5, 5, 4, 3, 7, 6])

For a 100k long array of integers randomly distributed between 0 and 100, this is more than 3x faster than the currently accepted answer.

Another performant option suggested by user DSM in the Python chat room, is using scipy.stats to rank the data:

>>> import scipy.stats
>>> (scipy.stats.rankdata(vertices, 'dense') - 1).astype(int)
array([0, 1, 2, 2, 1, 4, 5, 5, 4, 3, 7, 6])

Upvotes: 3

Moinuddin Quadri
Moinuddin Quadri

Reputation: 48090

You may achieve it via creating intermediate dict object to map the number with its position in the original list via using sorted() and set() with enumerate():

>>> my_list = [0, 1, 2, 2, 1, 20, 21, 21, 20, 3, 23, 22]
>>> num_map  = {j: i for i, j in enumerate(sorted(set(my_list)))}
#                                             ^    ^ to get unique elements
#                                             ^ sort numbers in ascending order

>>> [num_map[n] for n in my_list]
[0, 1, 2, 2, 1, 4, 5, 5, 4, 3, 7, 6]

As commented by Stefan, it could be achieved in one-line using map() as:

list(map({j: i for i, j in enumerate(sorted(set(my_list)))}.get, my_list))
# ^ type-cast `map` object to `list` for Python 3.x compatibility

Upvotes: 7

Related Questions