Blerg

Reputation: 185

Numpy shuffle rows then sort by one column only

I have a record array with values such as these:

[(1, 3.0)
 (1, 5.0)
 (2, 4.0)
 (2, 7.0)
 (3, 9.0)
 (3, 3.0)]

I need to shuffle the rows and then sort the array by the first column only, so that rows sharing a first-column value stay in their shuffled order. A desired output would be:

[(1, 5.0)
 (1, 3.0)
 (2, 7.0)
 (2, 4.0)
 (3, 9.0)
 (3, 3.0)]

I first shuffled with numpy.random.shuffle(someArray), which worked as expected and gave a result like the following:

[(3, 3.0)
 (1, 5.0)
 (2, 7.0)
 (1, 3.0)
 (2, 4.0)
 (3, 9.0)]

But when I then sorted with someArray = numpy.sort(someArray, order=['firstColumn']), the result was the first array again, sorted by the second column as well as the first. It is as if I had used order=['firstColumn', 'secondColumn'].

Upvotes: 2

Views: 196

Answers (1)

behzad.nouri

Reputation: 77951

You can use np.argsort on the 1st column and specify kind='mergesort'. Mergesort is a stable sort, so rows with equal values in the 1st column keep their shuffled order. Then index the original array with the returned indices:

>>> a
array([(3, 3.0), (1, 5.0), (2, 7.0), (1, 3.0), (2, 4.0), (3, 9.0)], 
      dtype=[('1st', '<i8'), ('2nd', '<f8')])
>>> i = np.argsort(a['1st'], kind='mergesort')
>>> a[i]
array([(1, 5.0), (1, 3.0), (2, 7.0), (2, 4.0), (3, 3.0), (3, 9.0)], 
      dtype=[('1st', '<i8'), ('2nd', '<f8')])
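
For reference, the whole shuffle-then-sort flow can be sketched as a short script. This is only an illustration under some assumptions: the helper name sort_by_field_keep_order is made up for this example, the field names '1st' and '2nd' are taken from the session above, and the shuffle uses a permutation index rather than numpy.random.shuffle. The last line also shows why the attempt in the question did not work: per the NumPy documentation, fields left out of order are still used to break ties.

```python
import numpy as np


def sort_by_field_keep_order(records, field):
    """Sort a structured array by one field only.

    Hypothetical helper for this example. kind='mergesort' requests a
    stable sort, so rows with equal keys stay in their current order.
    """
    order = np.argsort(records[field], kind="mergesort")
    return records[order]


data = np.array(
    [(1, 3.0), (1, 5.0), (2, 4.0), (2, 7.0), (3, 9.0), (3, 3.0)],
    dtype=[("1st", "<i8"), ("2nd", "<f8")],
)

# Shuffle the rows by indexing with a random permutation (returns a copy).
rng = np.random.default_rng()
shuffled = data[rng.permutation(len(data))]

# Sort by the first field only; ties keep the shuffled order.
result = sort_by_field_keep_order(shuffled, "1st")
print(result)

# For comparison: np.sort with order=['1st'] falls back to the remaining
# fields to break ties, which undoes the shuffle within each group.
fully_sorted = np.sort(shuffled, order=["1st"])
print(fully_sorted)
```

On reasonably recent NumPy versions, kind='stable' should work in place of kind='mergesort' and states the intent more directly.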

Upvotes: 2
