Reputation: 939
I have an n x 2 matrix of integers. The first column is a series 0, 1, -1, 2, -2; however, these appear in the order in which they were compiled from their constituent matrices. The second column is a list of indices from another list.
I would like to sort the matrix by this second column. This would be equivalent to selecting two columns of data in Excel and sorting by Column B (where the data is in columns A and B). Each value in the first column must stay in the same row as its second-column counterpart. I have looked at solutions using the following:
data[np.argsort(data[:, 0])]
But this does not seem to work. The matrix in question looks like this:
matrix([[1, 1],
[1, 3],
[1, 7],
...,
[2, 1021],
[2, 1040],
[2, 1052]])
Upvotes: 3
Views: 25646
Reputation: 25813
You had the right idea, just off by a few characters:
>>> import numpy as np
>>> data = np.matrix([[9, 8],
... [7, 6],
... [5, 4],
... [3, 2],
... [1, 0]])
>>> data[np.argsort(data.A[:, 1])]
matrix([[1, 0],
[3, 2],
[5, 4],
[7, 6],
[9, 8]])
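The difference from the attempt in the question seems to come down to two things: the column index (1 rather than 0) and the .A attribute, which exposes the matrix as a plain ndarray so that the column slice is 1-D. A small sketch of why that matters, reusing the sample data above (the variable names col_2d, col_1d, order and sorted_data are only illustrative):

```python
import numpy as np

data = np.matrix([[9, 8], [7, 6], [5, 4], [3, 2], [1, 0]])

# Slicing a column of a matrix keeps it 2-D, so argsort runs along an
# axis of length 1 and every index comes back as 0.
col_2d = data[:, 1]
print(col_2d.shape)            # (5, 1)
print(np.argsort(col_2d).T)    # [[0 0 0 0 0]]

# Going through .A gives a 1-D column, so argsort returns a usable row order.
col_1d = data.A[:, 1]
order = np.argsort(col_1d)
print(order)                   # [4 3 2 1 0]

# Indexing with the 1-D order reorders whole rows, keeping each pair together.
sorted_data = data[order]
```

If the data is already a plain ndarray rather than a matrix, the .A step should not be needed.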
Upvotes: 1
Reputation: 879351
You could use np.lexsort:
numpy.lexsort(keys, axis=-1)
Perform an indirect sort using a sequence of keys.
Given multiple sorting keys, which can be interpreted as columns in a spreadsheet, lexsort returns an array of integer indices that describes the sort order by multiple columns.
In [13]: data = np.matrix(np.arange(10)[::-1].reshape(-1,2))
In [14]: data
Out[14]:
matrix([[9, 8],
[7, 6],
[5, 4],
[3, 2],
[1, 0]])
In [15]: temp = data.view(np.ndarray)
In [16]: np.lexsort((temp[:, 1], ))
Out[16]: array([4, 3, 2, 1, 0])
In [17]: temp[np.lexsort((temp[:, 1], ))]
Out[17]:
array([[1, 0],
[3, 2],
[5, 4],
[7, 6],
[9, 8]])
Note that if you pass more than one key to np.lexsort, the last key is the primary key. The next-to-last key is the secondary key, and so on.
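To illustrate the key ordering, here is a minimal sketch on a made-up array (arr is hypothetical data, not the matrix from the question) where the second column contains ties, so the first column acts as a tie-breaker:

```python
import numpy as np

# Hypothetical data with repeated values in the second column.
arr = np.array([[ 2, 7],
                [-1, 3],
                [ 1, 7],
                [ 0, 3]])

# Keys are listed from least to most significant:
# column 1 is the primary key, column 0 breaks ties.
order = np.lexsort((arr[:, 0], arr[:, 1]))
print(order)         # [1 3 2 0]

sorted_arr = arr[order]
print(sorted_arr)
# [[-1  3]
#  [ 0  3]
#  [ 1  7]
#  [ 2  7]]
```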
Using np.lexsort as I show above requires the use of a temporary array because np.lexsort does not work on numpy matrices. Since temp = data.view(np.ndarray) creates a view, rather than a copy of data, it does not require much extra memory. However, temp[np.lexsort((temp[:, 1], ))] is a new array, which does require more memory.
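One way to check the view-versus-copy distinction is np.shares_memory. This is only a sketch; the name result is introduced here for illustration:

```python
import numpy as np

data = np.matrix(np.arange(10)[::-1].reshape(-1, 2))

temp = data.view(np.ndarray)                 # same buffer, seen as an ndarray
result = temp[np.lexsort((temp[:, 1],))]     # indexing with an index array copies

shares_with_view = np.shares_memory(data, temp)
shares_with_result = np.shares_memory(data, result)
print(shares_with_view)      # True
print(shares_with_result)    # False
```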
There is also a way to sort by columns in-place. The idea is to view the array as a structured array with two columns. Unlike plain ndarrays, structured arrays have a sort method which allows you to specify columns as keys:
In [65]: data.dtype
Out[65]: dtype('int32')
In [66]: temp2 = data.ravel().view('int32, int32')
In [67]: temp2.sort(order = ['f1', 'f0'])
Notice that since temp2 is a view of data, it does not require allocating new memory and copying the array. Also, sorting temp2 modifies data at the same time:
In [69]: data
Out[69]:
matrix([[1, 0],
[3, 2],
[5, 4],
[7, 6],
[9, 8]])
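The 'int32, int32' string matches the dtype shown in this session, but the default integer dtype can differ between platforms. A hedged variant of the same idea, written for a plain C-contiguous ndarray and building the two-field dtype from the array's own dtype (the names arr, pair and records are illustrative), might look like this:

```python
import numpy as np

arr = np.array([[9, 8], [7, 6], [5, 4], [3, 2], [1, 0]])

# Build a two-field dtype from the array's own dtype instead of hard-coding int32.
pair = np.dtype([('f0', arr.dtype), ('f1', arr.dtype)])

# Assumes arr is C-contiguous with exactly two columns: each row becomes one record.
records = arr.view(pair).reshape(-1)

# Sort in place by the second column, then by the first to break ties.
records.sort(order=['f1', 'f0'])

# records is a view, so arr itself is now sorted.
print(arr)
# [[1 0]
#  [3 2]
#  [5 4]
#  [7 6]
#  [9 8]]
```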
Upvotes: 3