Reputation: 846
I have a numpy array as follows,
import numpy as np
arr = np.array([0.166667, 0., 0., 0.333333, 0., 0.166667, 0.166667, np.nan])
I want to rank this array in descending order so that the highest value gets rank 1 and np.nan gets the last rank, without incrementing the rank for repeated values.
Expectation:
ranks = [2, 3, 3, 1, 3, 2, 2, 4]
i.e.
>>>>
1 0.333333
2 0.166667
2 0.166667
2 0.166667
3 0.0
3 0.0
3 0.0
4 -inf
What I have accomplished so far is below,
I used np.argsort twice and replaced np.nan with -inf (the lowest possible float), but the rank still increments for equal values.
# The Logic
arr = np.nan_to_num(arr, nan=float('-inf'))
ranks = list(np.argsort(np.argsort(arr)[::-1]) + 1)
# Pretty Print
sorted_ = sorted([(r, a) for a, r in zip(arr, ranks)], key=lambda v: v[0])
for r, a in sorted_:
    print(r, a)
>>>>
1 0.333333
2 0.166667
3 0.166667
4 0.166667
5 0.0
6 0.0
7 0.0
8 -inf
Any idea on how to manage the ranks without increments?
https://repl.it/@MilindDalvi/MidnightblueUnselfishCategories
Upvotes: 3
Views: 211
Reputation: 4537
numpy.unique sorts the unique values in ascending order, so passing -arr gives the descending order you want, and NaN sorts last. With return_inverse=True it also returns, for each element, its index among the unique values, which is exactly your rank minus one.
arr_u, inv = np.unique(-arr, return_inverse=True)
rank = inv + 1
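For anyone who wants to try this end to end, here is a minimal sketch that wraps the two lines above in a function and runs it on the array from the question. The name dense_rank_desc is my own, not a NumPy function. One assumption to be aware of: this relies on np.unique collapsing NaNs into a single entry, which older NumPy releases may not do when the input holds more than one NaN.

```python
import numpy as np

def dense_rank_desc(values):
    """Rank values in descending order; ties share a rank and NaN ranks last.

    Hypothetical helper name, used here only for illustration.
    """
    values = np.asarray(values, dtype=float)
    # np.unique returns the sorted distinct values (NaN at the end) and, with
    # return_inverse=True, the index of each input element among them.
    _, inverse = np.unique(-values, return_inverse=True)
    return inverse + 1

arr = np.array([0.166667, 0., 0., 0.333333, 0., 0.166667, 0.166667, np.nan])
print(dense_rank_desc(arr).tolist())  # [2, 3, 3, 1, 3, 2, 2, 4]
```

If arr has already been passed through np.nan_to_num with nan=-inf, as in the question, the result should be the same, since negating -inf gives +inf, which also sorts last.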
Upvotes: 0
Reputation: 3988
Here's one approach:
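# Distinct non-NaN values, largest first (a bare set has no guaranteed order)
v = sorted({x for x in arr if not np.isnan(x)}, reverse=True)
# NaN, if present, takes the rank after the last distinct value
print([v.index(x) + 1 if not np.isnan(x) else len(v) + 1 for x in arr])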
Output
[2, 3, 3, 1, 3, 2, 2, 4]
Upvotes: 0
Reputation: 458
Not necessarily a better way - just another way of approaching this issue
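arr = np.array([0.166667, 0., 0., 0.333333, 0., 0.166667, 0.166667, np.nan])
# sorted() cannot order NaN reliably, so map it to -inf before sorting
arr = sorted(np.nan_to_num(arr, nan=float('-inf')), reverse=True)
count = 1
mydict = {}
for a in arr:
    # Assign the next rank only the first time a value is seen
    if a not in mydict:
        mydict[a] = count
        count += 1
for i in arr:
    print(mydict[i], i)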
Upvotes: 0
Reputation: 88236
Here's a pandas approach using Series.rank, setting method="min" and na_option='bottom', then remapping the resulting ranks to consecutive integers:
s = pd.Series(arr).rank(method="min", na_option ='bottom', ascending=False)
u = np.sort(s.unique())
s.map(dict(zip(u, range(len(u))))).add(1).values
# array([2, 3, 3, 1, 3, 2, 2, 4], dtype=int64)
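As a possible shortcut, Series.rank also accepts method="dense", which assigns consecutive ranks to distinct values and so should make the remapping step unnecessary. A minimal sketch on the array from the question, assuming pandas and NumPy are installed:

```python
import numpy as np
import pandas as pd

arr = np.array([0.166667, 0., 0., 0.333333, 0., 0.166667, 0.166667, np.nan])

ranks = (
    pd.Series(arr)
    # "dense": tied values share a rank and the next distinct value gets rank + 1;
    # na_option="bottom" places NaN after every real value.
    .rank(method="dense", na_option="bottom", ascending=False)
    .astype(int)  # rank() returns floats
    .to_numpy()
)
print(ranks.tolist())  # [2, 3, 3, 1, 3, 2, 2, 4]
```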
Upvotes: 1
Reputation: 1330
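Try something like this before the final print loop:
k = 1
ranked = [(k, sorted_[0][1])]
for i in range(1, len(sorted_)):
    # Move to the next rank only when the value changes
    if sorted_[i][1] != sorted_[i - 1][1]:
        k += 1
    # Tuples are immutable, so build a new list instead of assigning in place
    ranked.append((k, sorted_[i][1]))
sorted_ = ranked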
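The same idea (walk the values in sorted order and bump the rank only when the value changes) can also be written without a Python loop. This is a rough sketch rather than a tuned implementation; rank_by_change is a name I made up, and it maps NaN to -inf first, so a genuine -inf in the input would tie with NaN.

```python
import numpy as np

def rank_by_change(values):
    """Dense descending ranks; hypothetical helper for illustration only."""
    values = np.asarray(values, dtype=float)
    # Assumption: NaN should rank last, so treat it as -inf.
    filled = np.where(np.isnan(values), -np.inf, values)
    # Indices that sort the values from largest to smallest.
    order = np.argsort(-filled, kind="stable")
    sorted_vals = filled[order]
    # True wherever a sorted value differs from its predecessor.
    is_new = np.ones(len(sorted_vals), dtype=bool)
    is_new[1:] = sorted_vals[1:] != sorted_vals[:-1]
    # The running count of changes is the rank; scatter it back to input order.
    ranks = np.empty(len(sorted_vals), dtype=int)
    ranks[order] = np.cumsum(is_new)
    return ranks

arr = np.array([0.166667, 0., 0., 0.333333, 0., 0.166667, 0.166667, np.nan])
print(rank_by_change(arr).tolist())  # [2, 3, 3, 1, 3, 2, 2, 4]
```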
Upvotes: 0