ZappaZ

Reputation: 105

Reverse indices of a sorted list

I want to return the 'reverse' indices of a sorted list. What I mean by that is: I have an unsorted list U and I sort it via S=sorted(U). I can already get the sort indices idx such that U[idx[i]] == S[i] for all i, but I want the indices Ridx such that S[Ridx[i]] == U[i].

Here is a little example:

>>> U = [5, 2, 3, 1, 4]
>>> S = sorted(U)
>>> idx = [U.index(S[i]) for i in range(len(U))]
>>> idx
[3, 1, 2, 4, 0]
>>> Ridx = [S.index(U[i]) for i in range(len(U))]
>>> Ridx
[4, 1, 2, 0, 3]
>>> [U[idx[i]] for i in range(len(U))] == S
True
>>> [S[Ridx[i]] for i in range(len(U))] == U
True

What I need is an efficient way to get Ridx.

Thanks!


Edit:

All right! I did a little speed test of the two solutions that answered the question (@Jon Clements and @Whatang).

The script:

import datetime as DT
import random

U=[int(1000*random.random()) for i in xrange(pow(10,8))]

S=sorted(U)

idx = sorted(xrange(len(U)), key=U.__getitem__)

T0 = DT.datetime.now()
ridx = sorted(xrange(len(U)), key=idx.__getitem__)
print [S[ridx[i]] for i in range(len(U))]==U
elapsed = DT.datetime.now()-T0
print str(elapsed)

print '==============='
T0 = DT.datetime.now()
ridx = [ y for (x,y) in sorted(zip(idx, range(len(idx)))) ]
print [S[ridx[i]] for i in range(len(U))]==U
elapsed = DT.datetime.now()-T0
print str(elapsed)

And the results (first @Jon Clements' key-based sort, then @Whatang's zip-based sort):

True
0:02:45.278000
===============
True
0:06:48.889000
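
Note that each timed region above also includes the verification pass (the list comprehension compared against U). A hedged sketch of how the two inversions could be timed in isolation with timeit, written for Python 3 and a much smaller list; the size n, the repeat count and the function names are arbitrary choices, not part of the original test:

```python
import random
import timeit

n = 10 ** 5  # arbitrary size, far smaller than the original test
U = [random.randrange(1000) for _ in range(n)]
S = sorted(U)
idx = sorted(range(n), key=U.__getitem__)


def ridx_by_key():
    # Sort the index range using idx as the key.
    return sorted(range(n), key=idx.__getitem__)


def ridx_by_zip():
    # Sort (idx value, position) pairs and keep the positions.
    return [y for (x, y) in sorted(zip(idx, range(n)))]


# Check correctness once, outside the timed region.
assert [S[r] for r in ridx_by_key()] == U
assert [S[r] for r in ridx_by_zip()] == U

for func in (ridx_by_key, ridx_by_zip):
    # Best of three single runs, in seconds.
    best = min(timeit.repeat(func, number=1, repeat=3))
    print(func.__name__, best)
```

Absolute numbers will depend on the machine and the Python version, so only the relative ordering is worth comparing.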

Thank you all for the quick and meaningful help!

Upvotes: 5

Views: 4098

Answers (5)

Phillip Cloud

Reputation: 25652

With numpy you can do

>>> import numpy as np
>>> U = [5, 2, 3, 1, 4]

>>> np.array(U).argsort().argsort()
array([4, 1, 2, 0, 3])
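
The double argsort sorts twice. As a possible alternative, the inverse can be filled in directly with index assignment, which avoids the second sort. A sketch assuming a NumPy version that supports kind="stable" (1.15 or later), used here so that equal values keep their original order:

```python
import numpy as np

U = np.array([5, 2, 3, 1, 4])

# idx[k] is the position in U of the k-th smallest element.
idx = np.argsort(U, kind="stable")

# Invert the permutation without a second sort: ridx[idx[k]] = k.
ridx = np.empty_like(idx)
ridx[idx] = np.arange(len(idx))

S = U[idx]       # the sorted array
# S[ridx] reproduces U element by element.
```

Whether this is faster than argsort().argsort() in practice would need to be measured on the data at hand.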

Upvotes: 2

Whatang

Reputation: 10356

Assuming you already have the list idx, you can do

ridx = [ y for (x,y) in sorted(zip(idx, range(len(idx)))) ]

Then for all i from 0 to len(U) - 1

S[ridx[i]] == U[i]

You can avoid the sort if you use a dictionary:

ridx_dict = dict(zip(idx, range(len(idx))))

which can then be converted to a list:

ridx = [ ridx_dict[k] for k in range(len(idx)) ]
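
Because idx is a permutation of 0..n-1, a preallocated list can play the role of the dictionary, which gives the inverse in a single linear pass. A minimal sketch, assuming Python 3; the helper name invert_permutation is made up for illustration:

```python
def invert_permutation(idx):
    """Return ridx such that ridx[idx[k]] == k for every k.

    Assumes idx is a permutation of range(len(idx)).
    """
    ridx = [0] * len(idx)
    for position, source in enumerate(idx):
        # S[position] came from U[source], so U[source] lands at S[position].
        ridx[source] = position
    return ridx


U = [5, 2, 3, 1, 4]
S = sorted(U)
idx = sorted(range(len(U)), key=U.__getitem__)   # [3, 1, 2, 4, 0]
ridx = invert_permutation(idx)                   # [4, 1, 2, 0, 3]
```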

Thinking about permutations is the key to this problem. One way of writing down a permutation is to write all the indexes in order on one line, then on the line below write the new index of the element with that index. e.g., for your example

0 1 2 3 4
3 1 2 4 0

This second line is your idx list. You read down the columns: the element at index 0 of S comes from index 3 of U, the element at index 1 stays at index 1, and so on.

The inverse permutation is the ridx you're looking for. To find it, sort the lower line of your permutation, keeping columns together, then write down the new top line. So the example becomes:

4 1 2 0 3
0 1 2 3 4
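
The column-sorting procedure can be written out literally. A small sketch, assuming Python 3; the names top, bottom and columns are illustrative only:

```python
idx = [3, 1, 2, 4, 0]

top = list(range(len(idx)))      # 0 1 2 3 4
bottom = idx                     # 3 1 2 4 0

# Keep each column together as a (bottom, top) pair and sort by the bottom entry.
columns = sorted(zip(bottom, top))

# After sorting, the bottom line reads 0 1 2 3 4 and the top line is the inverse.
new_bottom = [b for b, t in columns]
ridx = [t for b, t in columns]

# Applying one permutation after the other gives the identity.
identity = [idx[ridx[i]] for i in range(len(idx))]
```

The final check is a cheap way to confirm that two index lists really are inverses of each other.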

Upvotes: 1

Jon Clements

Reputation: 142106

The most efficient way I can think of (short of possibly looking to numpy), which gets rid of the .index calls and can be used for both idx and ridx:

U=[5,2,3,1,4]
idx = sorted(xrange(len(U)), key=U.__getitem__)
ridx = sorted(xrange(len(U)), key=idx.__getitem__)
# [3, 1, 2, 4, 0] [4, 1, 2, 0, 3]
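
Besides speed, there is a correctness reason to drop .index: list.index always returns the first match, so the .index-based version repeats indices when U contains duplicates, whereas sorted() is stable and still yields a valid permutation. A small illustration, written for Python 3 (range instead of xrange) with a made-up input list:

```python
U = [2, 1, 2, 1]          # illustrative input with repeated values
S = sorted(U)             # [1, 1, 2, 2]

# .index-based version: the first match is returned every time.
idx_by_index = [U.index(s) for s in S]             # [1, 1, 0, 0], not a permutation

# Key-based version: ties keep their original order because sorted() is stable.
idx = sorted(range(len(U)), key=U.__getitem__)     # [1, 3, 0, 2]
ridx = sorted(range(len(U)), key=idx.__getitem__)  # [2, 0, 3, 1]
```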

Upvotes: 5

Jason M

Reputation: 239

If I understand the question correctly (which I didn't), I think U.index(S[i]) is what you are looking for.

EDIT: so I guess you could save a dictionary of the original indices and keep the retrieval syntax pretty simple

OIDX = {U[i]: i for i in range(0, len(U))}
S = sorted(U)
OIDX[S[i]]

Upvotes: 0

Steven Rumbalski

Reputation: 45542

Not quite the data structure you asked for, but I think this gets the info you want:

>>> sorted(x[::-1] for x in enumerate(['z', 'a', 'c', 'x', 'm']))
[('a', 1), ('c', 2), ('m', 4), ('x', 3), ('z', 0)]

Upvotes: 2
