Reputation: 145
I would like to sort an entire matrix and get as a result a vector of indices. In R this can be simply achieved through the command
order("matrix")
In Python the analogous command would be:
np.argsort("matrix", axis=None)
The problem is that while R recognizes the index of a vector also for matrices (i.e. matrix[1]
gives a number, not a row), in Python we seem to need always both indeces.
So I had to follow the steps in this question. But I need to change the code, since zip()
seems to work correctly only together with list()
in Python 3+ (or am I wrong?). The new code would be
import numpy as np
a = np.array([[1,4,3],[2,8,0]])
n = a.shape[1]
d = list(zip( np.argsort(a, axis=None).__divmod__(n)))
d = list(zip(d[0][0], d[1][0]))
But the R code is so simple! Hence I wonder whether it is not possible to find a similarly simple command in Python. Or whether there is some similarly convenient way of treating a matrix also as a vector.
Upvotes: 3
Views: 1689
Reputation: 231355
Your d
is
In [362]: d
Out[362]: [(1, 2), (0, 0), (1, 0), (0, 2), (0, 1), (1, 1)]
argsort
with None
produces the sorted indices for a flattened version of the array:
In [366]: np.argsort(a, axis=None)
Out[366]: array([5, 0, 3, 2, 1, 4], dtype=int32)
In [368]: a.flat[_] # apply this index to flattened a
Out[368]: array([0, 1, 2, 3, 4, 8])
np.argsort(a.flat)
does the same thing. (or use a.ravel()
if that makes more sense to you.)
We can unravel
the 1d indices:
In [369]: np.unravel_index(Out[366], a.shape)
Out[369]:
(array([1, 0, 1, 0, 0, 1], dtype=int32),
array([2, 0, 0, 2, 1, 1], dtype=int32))
This uses the same sort of divmod
calculation that you do. It can be used to index a
directly:
In [370]: a[_]
Out[370]: array([0, 1, 2, 3, 4, 8])
transpose
can turn that tuple of arrays into a 2d array similar to your d
:
In [371]: np.transpose(__)
Out[371]:
array([[1, 2],
[0, 0],
[1, 0],
[0, 2],
[0, 1],
[1, 1]], dtype=int32)
(np.argwhere
does the same thing with the results of np.where
.)
Or in one line
In [375]: np.transpose(np.unravel_index(np.argsort(a, axis=None), a.shape)).tolist()
Out[375]: [[1, 2], [0, 0], [1, 0], [0, 2], [0, 1], [1, 1]]
Note that is type of list of index pairs cannot be used to index a
. Out[369]
is the correct form for indexing items in a 2d array. I'm not familiar with R
, but it looks like its indexing style may be more like MATLAB.
Upvotes: 2
Reputation:
in Python we seem to need always both indeces
This is not true. You can access the elements of a 2D array (or an N-D array) with a flat index, like the indexes 5, 0, 3, 2, 1, 4 returned by argsort
. This can be done by using ravel
, which creates a one-dimensional view into your n-dimensional array.
a = np.array([[1, 4, 3], [2, 8, 0]])
ix = np.argsort(a, axis=None)
print(a.ravel()[ix])
This prints [0, 1, 2, 3, 4, 8], the array elements are returned according to the indexes that argsort produced.
This can also be used for writing into the array: for example,
a.ravel()[ix[0]] = 42
replaces the smallest element of the array with 42.
Note that passing axis=None
makes argsort
implicitly flattened the array before sorting, an equivalent of ravel
method.
Upvotes: 2