metasequoia

Reputation: 7274

python numpy, sort a matrix by row and column

I would like to use numpy to reorder the rows of a square matrix so that the rows other than the first follow the order of the labels in the first row. For example:

import numpy as np
a = np.array([['','z','b','d'],
              ['b','2','5','7'],
              ['d','0','1','3'],
              ['z','3','9','2']])

should return:

[['','z','b','d']
 ['z','3','9','2']
 ['b','2','5','7']
 ['d','0','1','3']]

Upvotes: 2

Views: 2378

Answers (2)

tiago

Reputation: 23492

Here's another way, assuming that what you want is indeed to order the rows by the labels in the first row:

>>> a[[list(a[:, 0]).index(i) for i in a[0]]]
array([['', 'z', 'b', 'd'],
       ['z', '3', '9', '2'],
       ['b', '2', '5', '7'],
       ['d', '0', '1', '3']], 
       dtype='|S1')
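
The one-liner above can also be written out in steps, which may be easier to follow. This is only a minimal sketch: the helper name reorder_rows_by_header is hypothetical, and it assumes that every label in the first row appears exactly once in the first column.

```python
import numpy as np

def reorder_rows_by_header(a):
    """Return a copy of `a` whose rows follow the label order of a[0].

    Hypothetical helper; assumes each label in the first row occurs
    exactly once in the first column.
    """
    # Map each row label (first column) to its row index.
    row_of = {label: i for i, label in enumerate(a[:, 0])}
    # For every label in the header row, pick the matching row index.
    order = [row_of[label] for label in a[0]]
    # Integer-list indexing returns the rows in that order.
    return a[order]

a = np.array([['', 'z', 'b', 'd'],
              ['b', '2', '5', '7'],
              ['d', '0', '1', '3'],
              ['z', '3', '9', '2']])
result = reorder_rows_by_header(a)
```

The dictionary lookup avoids calling list.index once per label, which probably only matters for large matrices.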

Upvotes: 2

tiago

Reputation: 23492

It is unclear why you want to have this data in a numpy array, when a dictionary would probably be more appropriate. I assume you want to do some calculations on the data, for which you probably don't want a string dtype.

In your example you want to sort by a key in the first row, presumably strings. If you want to access the array in a 'square' form (e.g. slices like a[:, 2]), all the elements will have to be converted to strings. Structured arrays will allow you to do a better sorting, but at the expense of less direct slicing (e.g. a['values'][:, 2] rather than a[:, 2]). Here's an example with a structured array that stores the labels as strings in a field 'names' and the numbers as integers in a field 'values'. You can then sort by the strings in 'names':

a = np.array([('b', [2, 5, 7]),
              ('d', [0, 1, 3]), 
              ('z', [3, 9, 2])],
              dtype=[('names', 'S1'),
                     ('values', '3int')])

You can access the names and the values records separately:

>>> a['names']
array(['b', 'd', 'z'], 
      dtype='|S1')

>>> a['values']
array([[2, 5, 7],
       [0, 1, 3],
       [3, 9, 2]])

And you can sort the values array based on a lexicographic sort of the names (here the names are already in order, so the result is unchanged):

>>> a['values'][np.argsort(a['names'])]
array([[2, 5, 7],
       [0, 1, 3],
       [3, 9, 2]])

Or reorder the values using the argsort of a different list of names:

>>> a['values'][np.argsort(['z', 'b', 'd'])]
array([[0, 1, 3],
       [3, 9, 2],
       [2, 5, 7]])
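
Note that np.argsort(['z', 'b', 'd']) returns the permutation that sorts that list, [1, 2, 0], so the rows above come out in the order d, z, b rather than z, b, d. If the goal is to arrange the rows in a given order of names, one possible sketch is to look up each wanted name with np.searchsorted. The names wanted, sorter, pos and order are illustrative, the 'U1' string dtype is an assumption made so the example runs on Python 3, and the sketch assumes every wanted name is present in a['names'].

```python
import numpy as np

# Same data as above; 'U1' (unicode) is assumed here instead of 'S1'
# so that plain Python 3 strings compare equal to the stored names.
a = np.array([('b', [2, 5, 7]),
              ('d', [0, 1, 3]),
              ('z', [3, 9, 2])],
             dtype=[('names', 'U1'), ('values', int, (3,))])

# Desired row order, e.g. taken from the header row of the question.
wanted = np.array(['z', 'b', 'd'])

# Permutation that sorts the stored names (searchsorted needs sorted input).
sorter = np.argsort(a['names'])
# Position of each wanted name within the sorted names.
pos = np.searchsorted(a['names'], wanted, sorter=sorter)
# Translate sorted positions back into indices into `a`.
order = sorter[pos]

reordered = a['values'][order]
```

With this data, reordered holds the rows for z, b and d in that order.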

Upvotes: 1
