Steve

Reputation: 2854

How do I rank a list in vanilla Python?

Let's say I have a list:

[4, 5, 2, 1]

I need to rank these and have the output as:

[3, 4, 2, 1]

If two elements have the same value, as in:

[4, 4, 2, 3]

then their ranks should be averaged, giving [3.5, 3.5, 1, 2].

EDIT
Here, rank means the 1-based position of a number in the sorted list. If several numbers share the same value, each of them gets the average of their positions.
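To make that definition concrete, here is a minimal sketch that follows it literally: sort the list, record the 1-based positions each value occupies, and give every element the mean of its value's positions. The name average_ranks is only an illustration, and the sketch assumes the values are hashable and mutually comparable.

```python
def average_ranks(values):
    """Rank each value by the mean of its 1-based positions in the sorted list."""
    positions = {}  # value -> list of 1-based positions in sorted order
    for pos, value in enumerate(sorted(values), start=1):
        positions.setdefault(value, []).append(pos)
    # Tied values share the average of the positions they occupy.
    return [sum(positions[v]) / float(len(positions[v])) for v in values]


print(average_ranks([4, 5, 2, 1]))  # [3.0, 4.0, 2.0, 1.0]
print(average_ranks([4, 4, 2, 3]))  # [3.5, 3.5, 1.0, 2.0]
```

This is meant to pin down the expected output, not to be the fastest approach.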

Upvotes: 0

Views: 697

Answers (2)

Steve

Reputation: 2854

I found an answer to this here: Efficient method to calculate the rank vector of a list in Python

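def rank_simple(vector):
    # Indices of vector's elements in ascending order of value (an argsort).
    return sorted(range(len(vector)), key=vector.__getitem__)

def rankdata(a):
    n = len(a)
    ivec = rank_simple(a)
    svec = [a[rank] for rank in ivec]  # values in sorted order
    sumranks = 0  # sum of 0-based positions in the current run of ties
    dupcount = 0  # length of the current run of ties
    newarray = [0] * n
    for i in range(n):  # range rather than xrange, so this also runs on Python 3
        sumranks += i
        dupcount += 1
        # A run of ties ends at the last element or when the next value differs.
        if i == n - 1 or svec[i] != svec[i + 1]:
            averank = sumranks / float(dupcount) + 1  # mean position, 1-based
            for j in range(i - dupcount + 1, i + 1):
                newarray[ivec[j]] = averank
            sumranks = 0
            dupcount = 0
    return newarray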

I would like to see if there are any simpler or more efficient ways of doing this.

Upvotes: 0

tzaman

Reputation: 47790

Probably not the most efficient, but this works.

  • rank takes an item and a sorted list, and works out the item's rank by finding where it would be inserted to go before all elements equal to it and where it would be inserted to go after them, then averaging the two positions (using array bisection).
  • rank_list uses rank to compute the ranks of all elements. The partial call is just a simplification, so that the list doesn't have to be sorted again for each item lookup.

Like so:

from bisect import bisect_left, bisect_right
from functools import partial

def rank(item, lst):
    '''return rank of item in sorted list'''
    return (1 + bisect_left(lst, item) + bisect_right(lst, item)) / 2.0

def rank_list(lst):
    f = partial(rank, lst=sorted(lst))
    return [f(i) for i in lst]

rank_list([4, 4, 2, 1])
## [3.5, 3.5, 2.0, 1.0]
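
To see why averaging the two insertion points gives the tied rank, it may help to look at what bisect returns for a repeated value. This is a small illustration using the list from the question; the variable names are just for the example.

```python
from bisect import bisect_left, bisect_right

sorted_values = [2, 3, 4, 4]

lo = bisect_left(sorted_values, 4)   # index of the first 4
hi = bisect_right(sorted_values, 4)  # index just past the last 4

# The tied items occupy 1-based positions lo + 1 through hi,
# and the mean of that range is (lo + 1 + hi) / 2.
tied_rank = (1 + lo + hi) / 2.0

print(lo, hi, tied_rank)  # 2 4 3.5
```

For a value that appears only once, hi is lo + 1, so the same formula reduces to the plain 1-based position.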

Upvotes: 2
