Reputation: 1469
I have a NumPy array with x columns and want to sort it by multiple columns (some of which may be of type np.str_). I know that I can do this using np.lexsort.
Is there a way to specify ascending or descending order for each sort column?
Example: I know that I can sort by multiple columns as follows (edited to indicate string columns):
import numpy as np
arr = np.array([list("aaabbb"),[1,2,3,1,4,3],[1,2,3,4,6,6]]).T # Define arr
idx = np.lexsort([arr[:,1], arr[:,2]]) # sort by column 2 and then by column 1 (i.e. reversed order)
arr = arr[idx]
I also understand that I can sort in descending order as follows:
arr = arr[idx[::-1]]
As far as I understand, this makes the first sort column (column 2) descending and leaves the subsequent columns ascending.
But how do I specify that I want the first sort (column 2) in descending order and the subsequent sort (column 1) also in descending order, so that I get the following?
Desired output:
array([['b', 4, 6],
['b', 3, 6],
['b', 1, 4],
['a', 3, 3],
['a', 2, 2],
['a', 1, 1]])
So basically, for my example I am looking for something equivalent to:
import pandas as pd
df = pd.DataFrame(arr, columns=list("abc"))
df.sort_values(by=["c","b"], ascending=[False, False])
In general, I want to be able to specify (i) the columns to sort by and (ii) the sort order per column (ascending/descending).
Upvotes: 3
Views: 1654
Reputation: 71610
Try negating the column you want reversed. np.lexsort then sorts by the negative of each value, which gives descending order for numeric columns:
idx = np.lexsort([arr[:,1], -arr[:,2]])
Output:
array([[2, 3, 6],
[2, 4, 6],
[2, 1, 4],
[1, 3, 3],
[1, 2, 2],
[1, 1, 1]])
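As a self-contained check of the negation trick, here is a minimal sketch on a purely numeric array. It assumes the first column holds 1/2 instead of 'a'/'b', which matches the output above. Keep in mind that np.lexsort treats the last key as the primary one, and that unary minus only works on numeric dtypes.

```python
import numpy as np

# Numeric stand-in for the question's array (assumption: 'a' -> 1, 'b' -> 2).
arr = np.array([[1, 1, 1, 2, 2, 2],
                [1, 2, 3, 1, 4, 3],
                [1, 2, 3, 4, 6, 6]]).T

# The last key is the primary one: column 2 descending (negated),
# ties broken by column 1 ascending.
idx = np.lexsort([arr[:, 1], -arr[:, 2]])
mixed = arr[idx]

# Negating both keys makes both columns descending.
idx_desc = np.lexsort([-arr[:, 1], -arr[:, 2]])
both_desc = arr[idx_desc]
```

Here `mixed` reproduces the output shown above, while `both_desc` puts the row with 4 in column 1 ahead of the row with 3 among the rows that tie on column 2.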
If there are strings, negation does not work, so try:
idx = np.lexsort([arr[:,1], arr[:,2]])
arr = arr[idx]
arr[:, 0] = arr[:, 0][::-1]  # reverses column 0 only, so its values no longer stay with their original rows
Output:
array([['2', '1', '1'],
['2', '2', '2'],
['2', '3', '3'],
['1', '1', '4'],
['1', '3', '6'],
['1', '4', '6']],
dtype='<U11')
Edit:
With the question's edit, both sort columns are descending, so reversing the sorted array is enough:
arr = arr[::-1]
This gives:
array([['b', '4', '6'],
['b', '3', '6'],
['b', '1', '4'],
['a', '3', '3'],
['a', '2', '2'],
['a', '1', '1']],
dtype='<U1')
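Reversing the whole result only covers the case where every sort column is descending. For a per-column choice that also works on string columns, one possible approach (a sketch, not a feature of np.lexsort itself) is to replace each key with integer ranks from np.unique(..., return_inverse=True) and negate the ranks of the descending columns. The helper name sort_rows and its parameters are hypothetical. Because the example array has a string dtype, numbers stored in it compare as text ('10' sorts before '2'), so this sketch assumes single-digit values or a prior cast to a numeric type.

```python
import numpy as np

def sort_rows(arr, cols, ascending):
    """Sort rows of a 2-D array by `cols` (primary column first), with one
    ascending/descending flag per column. Hypothetical helper."""
    keys = []
    for col, asc in zip(cols, ascending):
        # return_inverse gives each value's position among the sorted unique
        # values, i.e. an integer rank that exists even for string columns.
        _, rank = np.unique(arr[:, col], return_inverse=True)
        keys.append(rank if asc else -rank)
    # np.lexsort uses the LAST key as the primary one, so reverse the list.
    return arr[np.lexsort(keys[::-1])]

# The question's array, written as strings explicitly (assumption: this is
# what the mixed str/int list ends up as).
arr = np.array([list("aaabbb"), list("123143"), list("123466")]).T

# Column 2 descending, then column 1 descending (the desired output).
both_desc = sort_rows(arr, cols=[2, 1], ascending=[False, False])

# Column 0 descending, then column 1 ascending (a mixed order).
mixed = sort_rows(arr, cols=[0, 1], ascending=[False, True])
```

The call shape mirrors df.sort_values(by=..., ascending=[...]) from the question, with column indices in place of column names.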
Upvotes: 5