Reputation: 3129

How to get indices of a sorted array in Python

I have a numerical list:

myList = [1, 2, 3, 100, 5]

Now if I sort this list to obtain [1, 2, 3, 5, 100]. What I want is the indices of the elements from the original list in the sorted order i.e. [0, 1, 2, 4, 3] --- ala MATLAB's sort function that returns both values and indices.

Upvotes: 312

Answers (19)

Amit

Reputation: 2090

Some syntactic sugar changes and little embellishment, otherwise there are at least two similar answers in this post. I am posting this since unlike other posts, this post can get the "argsort" in descending manner as well.

from collections import namedtuple
from array import array
def argsort(seq, reverse=False):
    index = sorted(range(len(seq)), key=lambda x: seq[x] if not reverse else -seq[x])
    sorted_number = sorted(seq, reverse=reverse)
    argsorted = namedtuple('argsorted_info', ['sorting_index','sorted_number', 'original_number'])
    return argsorted(index, sorted_number, seq)

# Works for integers
a = argsort([3,1,7,5,9,4])
print(a)

b = argsort([3,1,7,5,9,4], reverse=True)
print(b)

# Works for float (beware of truncation)
a = argsort([1.3,1.1,1.7,1.5,1.9,1.4])
print(a)

b = argsort([1.3,1.1,1.7,1.5,1.9,1.4], reverse=True)
print(b)

# Works for array
c = argsort(array('l',[3,1,7,5,9,4]))
print(c)

d = argsort(array('l',[3,1,7,5,9,4]), reverse=True)
print(d)

gives the output as

argsorted_info(sorting_index=[1, 0, 5, 3, 2, 4], sorted_number=[1, 3, 4, 5, 7, 9], original_number=[3, 1, 7, 5, 9, 4])
argsorted_info(sorting_index=[4, 2, 3, 5, 0, 1], sorted_number=[9, 7, 5, 4, 3, 1], original_number=[3, 1, 7, 5, 9, 4])
argsorted_info(sorting_index=[1, 0, 5, 3, 2, 4], sorted_number=[1, 3, 4, 5, 7, 9], original_number=array('l', [3, 1, 7, 5, 9, 4]))
argsorted_info(sorting_index=[4, 2, 3, 5, 0, 1], sorted_number=[9, 7, 5, 4, 3, 1], original_number=array('l', [3, 1, 7, 5, 9, 4]))
argsorted_info(sorting_index=[1, 0, 5, 3, 2, 4], sorted_number=[1.1, 1.3, 1.4, 1.5, 1.7, 1.9], original_number=[1.3, 1.1, 1.7, 1.5, 1.9, 1.4])
argsorted_info(sorting_index=[4, 2, 3, 5, 0, 1], sorted_number=[1.9, 1.7, 1.5, 1.4, 1.3, 1.1], original_number=[1.3, 1.1, 1.7, 1.5, 1.9, 1.4])

Upvotes: 0

Denziloe

Reputation: 8131

Note that you'll often want to get the sorted values as well as the indices. Here's quick and nice way to do that (which doesn't waste time sorting everything twice):

myList = [1, 2, 3, 100, 5]

indices, values = zip(*sorted(enumerate(myList), key=lambda x: x[1]))

>>> indices
(0, 1, 2, 4, 3)
>>> values
(1, 2, 3, 5, 100)

Upvotes: 1

Dong Ren

Reputation: 1

a little confused for me.

The question is about "How to get indices of a sorted array in Python", I was understanding that How to get sorted indices of each elements in array.

So give my solution for someone who may need:

a = [1, 2, 3, 100, 5]
b = [m[0] for m in sorted([j for j in enumerate([i[0] for i in sorted(enumerate(a), key=lambda x:x[1])])],key=lambda x:x[1])]
print(b)

Upvotes: 0

ShadowRanger

Reputation: 155363

A variant on RustyRob's answer (which is already the most performant pure Python solution) that may be superior when the collection you're sorting either:

Isn't a sequence (e.g. it's a set, and there's a legitimate reason to want the indices corresponding to how far an iterator must be advanced to reach the item), or
Is a sequence without O(1) indexing (among Python's included batteries, collections.deque is a notable example of this)

Case #1 is unlikely to be useful, but case #2 is more likely to be meaningful. In either case, you have two choices:

Convert to a list/tuple and use the converted version, or
Use a trick to assign keys based on iteration order

This answer provides the solution to #2. Note that it's not guaranteed to work by the language standard; the language says each key will be computed once, but not the order they will be computed in. On every version of CPython, the reference interpreter, to date, it's precomputed in order from beginning to end, so this works, but be aware it's not guaranteed. In any event, the code is:

sizediterable = ...
sorted_indices = sorted(range(len(sizediterable)), key=lambda _, it=iter(sizediterable): next(it))

All that does is provide a key function that ignores the value it's given (an index) and instead provides the next item from an iterator preconstructed from the original container (cached as a defaulted argument to allow it to function as a one-liner). As a result, for something like a large collections.deque, where using its .__getitem__ involves O(n) work (and therefore computing all the keys would involve O(n²) work), sequential iteration remains O(1), so generating the keys remains just O(n).

If you need something guaranteed to work by the language standard, using built-in types, Roman's solution will have the same algorithmic efficiency as this solution (as neither of them rely on the algorithmic efficiency of indexing the original container).

To be clear, for the suggested use case with collections.deque, the deque would have to be quite large for this to matter; deques have a fairly large constant divisor for indexing, so only truly huge ones would have an issue. Of course, by the same token, the cost of sorting is pretty minimal if the inputs are small/cheap to compare, so if your inputs are large enough that efficient sorting matters, they're large enough for efficient indexing to matter too.

Upvotes: 0

Lorry

Reputation: 472

s = [2, 3, 1, 4, 5]
print([sorted(s, reverse=False).index(val) for val in s])

For a list with duplicate elements, it will return the rank without ties, e.g.

s = [2, 2, 1, 4, 5]
print([sorted(s, reverse=False).index(val) for val in s])

returns

[1, 1, 0, 3, 4]

Upvotes: 2

Nico Schlömer

Reputation: 58731

I did a quick performance check on these with perfplot (a project of mine) and found that it's hard to recommend anything else but

np.argsort(x)

(note the log scale):

Code to reproduce the plot:

import perfplot
import numpy as np


def sorted_enumerate(seq):
    return [i for (v, i) in sorted((v, i) for (i, v) in enumerate(seq))]


def sorted_enumerate_key(seq):
    return [x for x, y in sorted(enumerate(seq), key=lambda x: x[1])]


def sorted_range(seq):
    return sorted(range(len(seq)), key=seq.__getitem__)


b = perfplot.bench(
    setup=np.random.rand,
    kernels=[sorted_enumerate, sorted_enumerate_key, sorted_range, np.argsort],
    n_range=[2 ** k for k in range(15)],
    xlabel="len(x)",
)
b.save("out.png")

Upvotes: 54

J.Zhao

Reputation: 2330

firstly convert your list to this:

myList = [1, 2, 3, 100, 5]

add a index to your list's item

myList = [[0, 1], [1, 2], [2, 3], [3, 100], [4, 5]]

sorted(myList, key=lambda k:k[1])

result:

[[0, 1], [1, 2], [2, 3], [4, 5], [3, 100]]

Upvotes: 1

user15349912

Reputation:

Most easiest way you can use Numpy Packages for that purpose:

import numpy
s = numpy.array([2, 3, 1, 4, 5])
sort_index = numpy.argsort(s)
print(sort_index)

But If you want that you code should use baisc python code:

s = [2, 3, 1, 4, 5]
li=[]
  
for i in range(len(s)):
      li.append([s[i],i])
li.sort()
sort_index = []
  
for x in li:
      sort_index.append(x[1])
  
print(sort_index)

Upvotes: 3

DRV

Reputation: 756

Code:

s = [2, 3, 1, 4, 5]
li = []

for i in range(len(s)):
    li.append([s[i], i])
li.sort()
sort_index = []

for x in li:
    sort_index.append(x[1])

print(sort_index)

Try this, It worked for me cheers!

Upvotes: 0

MSeifert

Reputation: 152607

Essentially you need to do an argsort, what implementation you need depends if you want to use external libraries (e.g. NumPy) or if you want to stay pure-Python without dependencies.

The question you need to ask yourself is: Do you want the

indices that would sort the array/list
indices that the elements would have in the sorted array/list

Unfortunately the example in the question doesn't make it clear what is desired because both will give the same result:

>>> arr = np.array([1, 2, 3, 100, 5])

>>> np.argsort(np.argsort(arr))
array([0, 1, 2, 4, 3], dtype=int64)

>>> np.argsort(arr)
array([0, 1, 2, 4, 3], dtype=int64)

Choosing the `argsort` implementation

If you have NumPy at your disposal you can simply use the function numpy.argsort or method numpy.ndarray.argsort.

An implementation without NumPy was mentioned in some other answers already, so I'll just recap the fastest solution according to the benchmark answer here

def argsort(l):
    return sorted(range(len(l)), key=l.__getitem__)

Getting the indices that would sort the array/list

To get the indices that would sort the array/list you can simply call argsort on the array or list. I'm using the NumPy versions here but the Python implementation should give the same results

>>> arr = np.array([3, 1, 2, 4])
>>> np.argsort(arr)
array([1, 2, 0, 3], dtype=int64)

The result contains the indices that are needed to get the sorted array.

Since the sorted array would be [1, 2, 3, 4] the argsorted array contains the indices of these elements in the original.

The smallest value is 1 and it is at index 1 in the original so the first element of the result is 1.
The 2 is at index 2 in the original so the second element of the result is 2.
The 3 is at index 0 in the original so the third element of the result is 0.
The largest value 4 and it is at index 3 in the original so the last element of the result is 3.

Getting the indices that the elements would have in the sorted array/list

In this case you would need to apply argsort twice:

>>> arr = np.array([3, 1, 2, 4])
>>> np.argsort(np.argsort(arr))
array([2, 0, 1, 3], dtype=int64)

In this case :

the first element of the original is 3, which is the third largest value so it would have index 2 in the sorted array/list so the first element is 2.
the second element of the original is 1, which is the smallest value so it would have index 0 in the sorted array/list so the second element is 0.
the third element of the original is 2, which is the second-smallest value so it would have index 1 in the sorted array/list so the third element is 1.
the fourth element of the original is 4 which is the largest value so it would have index 3 in the sorted array/list so the last element is 3.

Upvotes: 10

Jai dewani

Reputation: 163

We will create another array of indexes from 0 to n-1 Then zip this to the original array and then sort it on the basis of the original values

ar = [1,2,3,4,5]
new_ar = list(zip(ar,[i for i in range(len(ar))]))
new_ar.sort()

Upvotes: 2

Dawn

Reputation: 3628

The other answers are WRONG.

Running argsort once is not the solution. For example, the following code:

import numpy as np
x = [3,1,2]
np.argsort(x)

yields array([1, 2, 0], dtype=int64) which is not what we want.

The answer should be to run argsort twice:

import numpy as np
x = [3,1,2]
np.argsort(np.argsort(x))

gives array([2, 0, 1], dtype=int64) as expected.

Upvotes: 9

Rusty Rob

Reputation: 17173

myList = [1, 2, 3, 100, 5]    
sorted(range(len(myList)),key=myList.__getitem__)

[0, 1, 2, 4, 3]

Upvotes: 97

negi

Reputation: 428

Import numpy as np

FOR INDEX

S=[11,2,44,55,66,0,10,3,33]

r=np.argsort(S)

[output]=array([5, 1, 7, 6, 0, 8, 2, 3, 4])

argsort Returns the indices of S in sorted order

FOR VALUE

np.sort(S)

[output]=array([ 0,  2,  3, 10, 11, 33, 44, 55, 66])

Upvotes: 1

mab

Reputation: 2774

If you do not want to use numpy,

sorted(range(len(seq)), key=seq.__getitem__)

is fastest, as demonstrated here.

Upvotes: 8

Matthew Lewis

Reputation: 3140

If you are using numpy, you have the argsort() function available:

>>> import numpy
>>> numpy.argsort(myList)
array([0, 1, 2, 4, 3])

http://docs.scipy.org/doc/numpy/reference/generated/numpy.argsort.html

This returns the arguments that would sort the array or list.

Upvotes: 297

Matt

Reputation: 17629

Updated answer with enumerate and itemgetter:

sorted(enumerate(a), key=lambda x: x[1])
# [(0, 1), (1, 2), (2, 3), (4, 5), (3, 100)]

Zip the lists together: The first element in the tuple will the index, the second is the value (then sort it using the second value of the tuple x[1], x is the tuple)

Or using itemgetter from the operatormodule`:

from operator import itemgetter
sorted(enumerate(a), key=itemgetter(1))

Upvotes: 16

Ant6n

Reputation: 2007

The answers with enumerate are nice, but I personally don't like the lambda used to sort by the value. The following just reverses the index and the value, and sorts that. So it'll first sort by value, then by index.

sorted((e,i) for i,e in enumerate(myList))

Upvotes: 27

Roman Bodnarchuk

Reputation: 29717

Something like next:

>>> myList = [1, 2, 3, 100, 5]
>>> [i[0] for i in sorted(enumerate(myList), key=lambda x:x[1])]
[0, 1, 2, 4, 3]

enumerate(myList) gives you a list containing tuples of (index, value):

[(0, 1), (1, 2), (2, 3), (3, 100), (4, 5)]

You sort the list by passing it to sorted and specifying a function to extract the sort key (the second element of each tuple; that's what the lambda is for. Finally, the original index of each sorted element is extracted using the [i[0] for i in ...] list comprehension.

Upvotes: 221

How to get indices of a sorted array in Python

Answers (19)

Choosing the argsort implementation

Getting the indices that would sort the array/list

Getting the indices that the elements would have in the sorted array/list

Related Questions

Choosing the `argsort` implementation