Reputation: 4828
>>> a = array([[10, 50, 20, 30, 40],
... [50, 30, 40, 20, 10],
... [30, 20, 20, 10, 50]])
>>> some_np_expression(a)
array([[1, 3, 1, 3, 2],
[3, 2, 3, 2, 1],
[2, 1, 2, 1, 3]])
What is some_np_expression? I don't care how ties are settled, so long as the ranks are distinct and sequential.
Upvotes: 6
Views: 2198
Reputation: 477
from scipy.stats.mstats import rankdata
import numpy as np
a = np.array([[10, 50, 20, 30, 40],
[50, 30, 40, 20, 10],
[30, 20, 20, 10, 50]])
rank = (rankdata(a, axis=0)-1).astype(int)
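The output is as follows. Note that the ranks are 0-based, and that tied values share an averaged rank which astype(int) truncates, so the two 20s in the third column both get rank 0.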
array([[0, 2, 0, 2, 1],
[2, 1, 2, 1, 0],
[1, 0, 0, 0, 2]])
Upvotes: 0
Reputation: 1492
SciPy now offers a rank function that takes an axis argument, so you can choose the axis along which the data is ranked.
from numpy import array
from scipy.stats.mstats import rankdata
a = array([[10, 50, 20, 30, 40],
[50, 30, 40, 20, 10],
[30, 20, 20, 10, 50]])
ranked_vertical = rankdata(a, axis=0)
Upvotes: 3
Reputation: 114811
Double argsort is a standard (but inefficient!) way to do this:
In [120]: a
Out[120]:
array([[10, 50, 20, 30, 40],
[50, 30, 40, 20, 10],
[30, 20, 20, 10, 50]])
In [121]: a.argsort(axis=0).argsort(axis=0) + 1
Out[121]:
array([[1, 3, 1, 3, 2],
[3, 2, 3, 2, 1],
[2, 1, 2, 1, 3]])
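To see why the double argsort yields ranks, a minimal 1-D sketch may help. The array x below is made up purely for illustration. The first argsort gives the permutation that sorts the data, and the argsort of a permutation is its inverse, which maps each original position to its place in the sorted order:

```python
import numpy as np

# Hypothetical 1-D data, chosen only for illustration.
x = np.array([30, 10, 20])

# Indices that would sort x: x[order] is [10, 20, 30].
order = x.argsort()

# Argsort of a permutation gives its inverse: for each original
# position, where that element lands in the sorted order (0-based rank).
ranks0 = order.argsort()

# Shift to 1-based ranks, as in the question.
ranks = ranks0 + 1
print(order, ranks0, ranks)
```

Here order is [1, 2, 0] and ranks is [3, 1, 2]: the value 30 is the largest, 10 the smallest, and 20 sits in between.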
With some more code, you can avoid sorting twice. Note that I'm using a different a in the following:
In [262]: a
Out[262]:
array([[30, 30, 10, 10],
[10, 20, 20, 30],
[20, 10, 30, 20]])
Call argsort once:
In [263]: s = a.argsort(axis=0)
Use s to construct the array of rankings:
In [264]: i = np.arange(a.shape[0]).reshape(-1, 1)
In [265]: j = np.arange(a.shape[1])
In [266]: ranked = np.empty_like(a, dtype=int)
In [267]: ranked[s, j] = i + 1
In [268]: ranked
Out[268]:
array([[3, 3, 1, 1],
[1, 2, 2, 3],
[2, 1, 3, 2]])
Here's the less efficient (but more concise) version:
In [269]: a.argsort(axis=0).argsort(axis=0) + 1
Out[269]:
array([[3, 3, 1, 1],
[1, 2, 2, 3],
[2, 1, 3, 2]])
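The single-argsort steps above could be wrapped in a reusable function along these lines. This is only a sketch: the name rank_along_axis is hypothetical, it assumes NumPy 1.15 or later for np.put_along_axis, and it uses a stable sort so that ties are broken by position along the axis (the question allows any tie-breaking).

```python
import numpy as np

def rank_along_axis(a, axis=0):
    """Return 1-based ranks along `axis` using a single argsort.

    Hypothetical helper for illustration. Ties are broken by position
    because a stable sort is requested.
    """
    a = np.asarray(a)
    # Permutation that sorts each 1-D slice along `axis`.
    order = np.argsort(a, axis=axis, kind="stable")

    # Ranks 1..n shaped so they broadcast along `axis` only.
    n = a.shape[axis]
    shape = [1] * a.ndim
    shape[axis] = n
    rank_values = np.arange(1, n + 1).reshape(shape)

    # Scatter: the element at sorted position k receives rank k + 1.
    ranked = np.empty(a.shape, dtype=int)
    np.put_along_axis(ranked, order, rank_values, axis=axis)
    return ranked

a_no_ties = np.array([[30, 30, 10, 10],
                      [10, 20, 20, 30],
                      [20, 10, 30, 20]])
a_with_ties = np.array([[10, 50, 20, 30, 40],
                        [50, 30, 40, 20, 10],
                        [30, 20, 20, 10, 50]])

print(rank_along_axis(a_no_ties))
print(rank_along_axis(a_with_ties))
```

For the array without ties this matches the double argsort, and for the array from the question every column contains each of the ranks 1, 2, 3 exactly once.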
Upvotes: 7