Reputation: 105
I want to return the 'reverse' indices of a sorted list. What I mean by that is: I have an unsorted list U and I sort it via S=sorted(U). Now, I can get the sort indices such that U(idx)=S, but I want S(Ridx)=U.
Here is a little example:
U=[5,2,3,1,4]
S=sorted(U)
idx = [U.index(S[i]) for i in range(len(U))]
>>> idx
[3, 1, 2, 4, 0]
Ridx = [S.index(U[i]) for i in range(len(U))]
>>> Ridx
[4, 1, 2, 0, 3]
>>> [U[idx[i]] for i in range(len(U))] == S
True
>>> [S[Ridx[i]] for i in range(len(U))] == U
True
What I need is an efficient way to get Ridx.
Thanks!
Edit:
All right! I did a little speed test for both of the solutions (@Jon Clements and @Whatang) which answered the question.
The script:
import datetime as DT
import random
U=[int(1000*random.random()) for i in xrange(pow(10,8))]
S=sorted(U)
idx = sorted(xrange(len(U)), key=U.__getitem__)
T0 = DT.datetime.now()
ridx = sorted(xrange(len(U)), key=idx.__getitem__)
print [S[ridx[i]] for i in range(len(U))]==U
elapsed = DT.datetime.now()-T0
print str(elapsed)
print '==============='
T0 = DT.datetime.now()
ridx = [ y for (x,y) in sorted(zip(idx, range(len(idx)))) ]
print [S[ridx[i]] for i in range(len(U))]==U
elapsed = DT.datetime.now()-T0
print str(elapsed)
And the results:
True
0:02:45.278000
===============
True
0:06:48.889000
Thank you all for the quick and meaningful help!
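For anyone repeating this comparison, a smaller sketch using timeit may be easier to run. It assumes Python 3 (range instead of xrange), times only the construction of ridx rather than the check, and uses a far smaller list than the original script. The function names and the size N are illustrative and not part of the original test.

```python
import random
import timeit

# Illustrative size; the original test used 10**8 elements.
N = 10_000
random.seed(0)
U = [random.randrange(1000) for _ in range(N)]
S = sorted(U)
idx = sorted(range(N), key=U.__getitem__)

def ridx_by_key(idx):
    # Sort the positions by the index stored at each position.
    return sorted(range(len(idx)), key=idx.__getitem__)

def ridx_by_zip(idx):
    # Sort (index, position) pairs and keep only the positions.
    return [pos for _, pos in sorted(zip(idx, range(len(idx))))]

for fn in (ridx_by_key, ridx_by_zip):
    ridx = fn(idx)
    # Verify the result outside the timed region.
    assert [S[r] for r in ridx] == U
    seconds = timeit.timeit(lambda: fn(idx), number=10)
    print(fn.__name__, seconds)
```

Absolute timings will differ by machine and Python version, so only the relative order of the two variants is worth comparing.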
Upvotes: 5
Views: 4098
Reputation: 25652
With numpy you can do
>>> import numpy as np
>>> U = [5, 2, 3, 1, 4]
>>> np.array(U).argsort().argsort()
array([4, 1, 2, 0, 3])
Upvotes: 2
Reputation: 10356
Assuming you already have the list idx, you can do
ridx = [ y for (x,y) in sorted(zip(idx, range(len(idx)))) ]
Then for all i from 0 to len(U) - 1, S[ridx[i]] == U[i].
You can avoid the sort if you use a dictionary:
ridx_dict = dict(zip(idx, range(len(idx))))
which can then be converted to a list:
ridx = [ ridx_dict[k] for k in range(len(idx)) ]
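The dictionary is not strictly needed either: since idx holds each index from 0 to len(idx) - 1 exactly once, the same mapping can be written straight into a pre-sized list in a single pass. A minimal sketch, assuming Python 3; invert_permutation is a hypothetical helper name.

```python
def invert_permutation(idx):
    """Return ridx such that ridx[idx[k]] == k for every k."""
    ridx = [0] * len(idx)
    for pos, i in enumerate(idx):
        # The element found at position pos of idx is the index i,
        # so the inverse maps i back to pos.
        ridx[i] = pos
    return ridx

U = [5, 2, 3, 1, 4]
S = sorted(U)
idx = sorted(range(len(U)), key=U.__getitem__)
ridx = invert_permutation(idx)
print(ridx)  # [4, 1, 2, 0, 3]
```

This only gives a valid result when idx really is a permutation of 0..n-1, which is the case for sort indices.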
Thinking about permutations is the key to this problem. One way of writing down a permutation is to write all the indexes in order on one line, then on the line below write the new index of the element with that index. e.g., for your example
0 1 2 3 4
3 1 2 4 0
This second line is your idx list. You read down the columns, so the element at index 0 of the sorted list S sits at index 3 of U, the element at index 1 stays at index 1, and so on.
The inverse permutation is the ridx you're looking for. To find this, sort the lower line of your permutation, keeping columns together, then write down the new top line. So the example becomes:
4 1 2 0 3
0 1 2 3 4
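The column-sorting procedure can be sketched directly in code. This assumes Python 3 and uses the idx list from the question's example; the variable name columns is illustrative.

```python
idx = [3, 1, 2, 4, 0]

# Two-line notation as a list of (top, bottom) columns.
columns = list(zip(range(len(idx)), idx))

# Sort by the bottom entry, keeping each column together.
columns.sort(key=lambda col: col[1])

# The new top line is the inverse permutation.
ridx = [top for top, _ in columns]
print(ridx)  # [4, 1, 2, 0, 3]

# Applying one permutation after the other gives the identity.
assert [idx[r] for r in ridx] == list(range(len(idx)))
assert [ridx[i] for i in idx] == list(range(len(idx)))
```

The two assertions at the end check the defining property of an inverse: composing idx and ridx in either order returns every index to where it started.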
Upvotes: 1
Reputation: 142106
The most efficient approach I can think of (short of possibly looking to numpy) that gets rid of the .index calls and can be used for both idx and ridx:
U=[5,2,3,1,4]
idx = sorted(xrange(len(U)), key=U.__getitem__)
ridx = sorted(xrange(len(U)), key=idx.__getitem__)
# [3, 1, 2, 4, 0] [4, 1, 2, 0, 3]
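The snippet above is written for Python 2. A hedged Python 3 sketch of the same idea, with range in place of xrange, could look like this; the wrapper name sort_indices is hypothetical.

```python
def sort_indices(U):
    """Return (idx, ridx) with U[idx[k]] == S[k] and S[ridx[i]] == U[i],
    where S = sorted(U)."""
    n = len(U)
    # Positions of U ordered by the value found at each position.
    idx = sorted(range(n), key=U.__getitem__)
    # Positions of idx ordered by the index found at each position.
    ridx = sorted(range(n), key=idx.__getitem__)
    return idx, ridx

U = [5, 2, 3, 1, 4]
S = sorted(U)
idx, ridx = sort_indices(U)
print(idx, ridx)  # [3, 1, 2, 4, 0] [4, 1, 2, 0, 3]
```

Because sorted is stable and idx is always a permutation, this also holds up when U contains repeated values, which the .index based version in the question does not handle as a true permutation.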
Upvotes: 5
Reputation: 239
If I understand the question correctly (which I didn't), I think U.index(S[i]) is what you are looking for.
EDIT: So I guess you could save a dictionary of the original indices and keep the retrieval syntax pretty simple:
OIDX = {U[i]: i for i in range(0, len(U))}
S = sorted(U)
OIDX[S[i]]
Upvotes: 0
Reputation: 45542
Not quite the data structure you asked for, but I think this gets the info you want:
>>> sorted(x[::-1] for x in enumerate(['z', 'a', 'c', 'x', 'm']))
[('a', 1), ('c', 2), ('m', 4), ('x', 3), ('z', 0)]
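If the list form of Ridx is still needed, the sorted (value, original index) pairs carry enough information to build it. A sketch under that assumption, in Python 3; the variable names are illustrative.

```python
U = ['z', 'a', 'c', 'x', 'm']

# (value, original index) pairs in sorted order, as in the answer above.
pairs = sorted(x[::-1] for x in enumerate(U))

S = [value for value, _ in pairs]
ridx = [0] * len(U)
for sorted_pos, (_, original_pos) in enumerate(pairs):
    # The element that started at original_pos ends up at sorted_pos.
    ridx[original_pos] = sorted_pos

print(ridx)  # [4, 0, 1, 3, 2]
```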
Upvotes: 2