Warrick

Reputation: 727

Sort NumPy float array column by column

Following this trick to grab unique entries for a NumPy array, I now have a two-column array, basically of pairs with first element in the range [0.9:0.02:1.1] and the second element in the range [1.5:0.1:2.0]. Let's call this A. Currently, it's completely unsorted, i.e.

In [111]: A
Out[111]: 
array([[ 1.1 ,  1.9 ],
       [ 1.06,  1.9 ],
       [ 1.08,  1.9 ],
       [ 1.08,  1.6 ],
       [ 0.9 ,  1.8 ],
       ...
       [ 1.04,  1.6 ],
       [ 0.96,  2.  ],
       [ 0.94,  2.  ],
       [ 0.98,  1.9 ]])

I'd like to sort the rows so that the second column increases within each value of the first column, and the first column increases overall, i.e.

array([[ 0.9 ,  1.5 ],
       [ 0.9 ,  1.6 ],
       [ 0.9 ,  1.7 ],
       [ 0.9 ,  1.8 ],
       [ 0.9 ,  1.9 ],
       [ 0.9 ,  2.  ],
       [ 0.92,  1.5 ],
       ...
       [ 1.08,  2.  ],
       [ 1.1 ,  1.5 ],
       [ 1.1 ,  1.6 ],
       [ 1.1 ,  1.7 ],
       [ 1.1 ,  1.8 ],
       [ 1.1 ,  1.9 ],
       [ 1.1 ,  2.  ]])

but I can't find a sort algorithm that gives both. As suggested here, I've tried A[A[:,0].argsort()] and A[A[:,1].argsort()], but they only sort one column each. I've also tried applying both but the same thing happens.

I apologize if I've missed something simple but I've been looking for this for a while now...

Upvotes: 9

Views: 5719

Answers (3)

seberg

Reputation: 8975

Just replace the whole thing (including the unique part) with the following, for a 2D A:

A = np.ascontiguousarray(A) # just to make sure...
A = A.view([('', A.dtype)] * A.shape[1])

A = np.unique(A)
# And if you want the old view:
A = A.view(A.dtype[0]).reshape(-1,len(A.dtype))

I hope you are not using the set solution from the linked question, unless you do not care too much about speed. lexsort etc. is generally great, but it is not necessary here, since the default sort will do (if it's a recarray).
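As a rough end-to-end sketch of the structured-view approach above: the sample array and the names `rec` and `result` are made up for illustration, and explicit field names are used in place of the empty ones.

```python
import numpy as np

# Hypothetical sample data with one duplicated row.
A = np.array([[1.1, 1.9], [0.9, 1.8], [1.1, 1.5], [0.9, 1.8], [0.9, 1.5]])

A = np.ascontiguousarray(A)  # the view below needs contiguous rows
# View each row as a single structured element with one field per column.
rec = A.view([(f"f{i}", A.dtype) for i in range(A.shape[1])])

# np.unique sorts structured elements field by field (f0 first, then f1),
# so the result is both de-duplicated and sorted.
rec = np.unique(rec)

# Back to a plain 2D float array.
result = rec.view(A.dtype).reshape(-1, A.shape[1])
print(result)
```

On NumPy 1.13 or later, `np.unique(A, axis=0)` should give the same rows without the manual view.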


Edit: A different view (with much the same result), but maybe a bit more elegant, as no reshape is needed:

A = A.view([('', A.dtype, A.shape[1])])  # one sub-array field holding a whole row
A = np.unique(A)
# And to go back
A = A.view(A.dtype[0].base)

Upvotes: 3

ecatmur

Reputation: 157424

numpy.lexsort will work here:

A[np.lexsort(A.T)]

You need to transpose A before passing it to lexsort because when passed a 2d array it expects to sort by rows (last row, second last row, etc).

An alternative, possibly slightly clearer, way is to pass the columns explicitly:

A[np.lexsort((A[:, 0], A[:, 1]))]

You still need to remember that lexsort sorts by the last key first (there's probably some good reason for this; it's the same as performing a stable sort on successive keys).
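To make the key order concrete, here is a small sketch on a made-up four-row array (the variable names are only illustrative). It shows which column ends up as the primary key for each ordering of the keys, and the equivalent chain of stable sorts.

```python
import numpy as np

# Hypothetical sample data.
A = np.array([[1.1, 1.9], [0.9, 1.8], [1.1, 1.5], [0.9, 1.5]])

# The LAST key is the primary one: here column 1, with ties broken by column 0.
by_second = A[np.lexsort((A[:, 0], A[:, 1]))]

# Swap the keys to make column 0 primary, with ties broken by column 1.
by_first = A[np.lexsort((A[:, 1], A[:, 0]))]

# Same idea with stable sorts: sort by the minor key first, then the major key.
idx = np.argsort(A[:, 1], kind="stable")
idx = idx[np.argsort(A[idx, 0], kind="stable")]
chained = A[idx]

print(by_second)
print(by_first)
```

If the ordering you want is the one in the question's example output (first column increasing, second column increasing within it), the `by_first` form, or equivalently `A[np.lexsort(A.T[::-1])]`, is the one to use.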

Upvotes: 7

mgilson

Reputation: 310089

The following will work, but there might be a faster way:

A = np.array(sorted(A,key=tuple))
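A quick sketch of what this does, on made-up data: each row is turned into a tuple, and tuples compare element by element, so the first column acts as the primary key. The comparison with `np.lexsort` is only there as a sanity check.

```python
import numpy as np

# Hypothetical sample data.
A = np.array([[1.1, 1.9], [0.9, 1.8], [1.1, 1.5], [0.9, 1.5]])

# Pure-Python sort of the rows, compared as tuples (column 0, then column 1).
by_tuple = np.array(sorted(A, key=tuple))

# Vectorised equivalent: column 0 is passed last, so it is the primary key.
by_lexsort = A[np.lexsort(A.T[::-1])]

print(by_tuple)
```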

Upvotes: 4
