Kdog

Reputation: 513

Sorting numpy array with lexsort. Alternative to pandas sort_values

I would like the row order of b to match that of a in the code below.

The result of b is not what I expected. I thought it would sort by the columns in the order given, but I think I misunderstood how lexsort works.

My goal is to sort an array the same way the DataFrame a below is sorted: column 0 descending, then column 1 ascending. I'm using lexsort because I think it is the best fit for an array that also contains categorical values.

import numpy as np
import pandas as pd

x = np.array([[1,2],[1,3],[1,4],[2,1],[2,2],[2,3],[2,4],[0,1],[1,0],[0,2]])

a=pd.DataFrame(x).sort_values(by=[0,1], ascending=[0,1])

b=x[np.lexsort((x[:,1],x[:,0][::-1]))]

print(a)
print(b)

Upvotes: 1

Views: 641

Answers (1)

sammywemmy

Reputation: 28709

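From the docs, np.lexsort treats the last key in the sequence as the primary sort key, so the keys go in (secondary, primary) order, i.e. column 1 first and column 0 last: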

sorter = np.lexsort((x[:, 1], x[:, 0]))

x[sorter][::-1] # reversing the ascending result makes both columns descending
Out[899]: 
array([[2, 4],
       [2, 3],
       [2, 2],
       [2, 1],
       [1, 4],
       [1, 3],
       [1, 2],
       [1, 0],
       [0, 2],
       [0, 1]])
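
As a quick check of the key order, here is a minimal sketch using the array from the question. The variable names by_col0_then_col1 and by_col1_then_col0 are only illustrative; the point is that swapping the keys swaps which column acts as the primary one.

```python
import numpy as np

x = np.array([[1, 2], [1, 3], [1, 4], [2, 1], [2, 2],
              [2, 3], [2, 4], [0, 1], [1, 0], [0, 2]])

# Keys are listed from least to most significant:
# the LAST key in the tuple is the primary sort key.
by_col0_then_col1 = x[np.lexsort((x[:, 1], x[:, 0]))]
by_col1_then_col0 = x[np.lexsort((x[:, 0], x[:, 1]))]

print(by_col0_then_col1[:3])  # rows with the smallest column-0 values come first
print(by_col1_then_col0[:3])  # rows with the smallest column-1 values come first
```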

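To get descending order on column 0 with ascending order on column 1, you could combine np.unique with np.split and np.concatenate: split the ascending result into groups that share a column-0 value, then reverse the order of the groups: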

temp = x[sorter]
_, s = np.unique(temp[:, 0], return_counts=True)
np.concatenate(np.split(temp, s.cumsum())[::-1])
Out[972]: 
array([[2, 1],
       [2, 2],
       [2, 3],
       [2, 4],
       [1, 0],
       [1, 2],
       [1, 3],
       [1, 4],
       [0, 1],
       [0, 2]])
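
A possible alternative, not part of the approach above, is to negate the key that should be descending before passing it to np.lexsort. This is only a sketch and assumes the key is a signed numeric type (negating unsigned integers wraps around). For categorical keys such as strings, one option is to convert them to integer codes with np.unique(..., return_inverse=True) and negate the codes instead. The labels and values arrays below are made-up illustration data.

```python
import numpy as np

x = np.array([[1, 2], [1, 3], [1, 4], [2, 1], [2, 2],
              [2, 3], [2, 4], [0, 1], [1, 0], [0, 2]])

# Negating the primary key (column 0) flips its order to descending,
# while the secondary key (column 1) stays ascending.
order = np.lexsort((x[:, 1], -x[:, 0]))
result = x[order]
print(result)

# Hypothetical categorical example: strings cannot be negated directly,
# so map them to integer codes first (codes follow sorted label order).
labels = np.array(["b", "a", "c", "a"])
values = np.array([2, 1, 3, 0])
_, codes = np.unique(labels, return_inverse=True)
order2 = np.lexsort((values, -codes))  # labels descending, values ascending
print(labels[order2], values[order2])
```

With the array from the question, result has the same row order as the np.split output shown above.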

Upvotes: 2
