Reputation: 1050
I have two numpy arrays. One is N by M, the other is N by 1. I want to be able to sort the first array by any one of its M columns, and I want both arrays to keep the same row order (i.e., if I swap rows 1 and 15 of the first array, I want rows 1 and 15 of the second array to swap too).
For example:
import numpy as np
a = np.array([[1,6],[3,4],[2,5]])
b = np.array([[.5],[.8],[.2]])
Then, I'd like to be able to sort by, say, the first element of each row in a to give:
a = [[1,6],[2,5],[3,4]]
b = [[.5],[.2],[.8]]
or to sort by, say, the second element of each row in a to give:
a = [[3,4],[2,5],[1,6]]
b = [[.8],[.2],[.5]]
I see lots of similar questions in which both lists are one-dimensional (e.g., this question), or questions about sorting lists of lists (e.g., this one), but I can't find what I'm looking for.
Eventually I got this to work:
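import numpy as np
a = np.array([[1,6],[3,4],[2,5]])
b = np.array([[.5],[.8],[.2]])
# list() so the zipped pairs can be indexed on Python 3, where zip() is lazy
package = list(zip(a,b))
print(package[0][1])
# sort the (row of a, row of b) pairs by the second element of the row of a
sortedpackage = sorted(package, key=lambda dim: dim[0][1])
d,e = zip(*sortedpackage)
print(d)
print(e)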
Now this produces d and e as I want:
d = [[3,4],[2,5],[1,6]]
e = [[.8],[.2],[.5]]
But I don't understand why. Printing package[0][1] gives 0.5, which is not the element I'm sorting by. Why is this? Is what I'm doing robust?
Upvotes: 2
Views: 191
Reputation: 26397
package[0][1] evaluates to 0.5 because that expression indexes the list of tuples itself: [0] picks the first tuple and [1] picks that tuple's second item. sorted, by contrast, passes each element of the iterable to the key function one at a time.
You zip a and b into package:
[([1, 6], [0.5]),
([3, 4], [0.8]),
([2, 5], [0.2])]
It is at this point that you print package[0][1]. The first element is package[0] = ([1, 6], [0.5]). The next index, [1], gives you the second element of that first tuple, thus you get 0.5.
sorted, on the other hand, examines the elements of the iterable individually. It may first look at ([1, 6], [0.5]), then ([3, 4], [0.8]), and so on.
So when you specify a key with a lambda function, you are really saying: for this particular element of the iterable, get the value at [0][1]. That is, sort by the second value of the first element of the given tuple (the second value of the corresponding row of a).
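To make the difference concrete, here is a minimal sketch (assuming Python 3, hence the list() around zip; the names first_pair, key and keys are only illustrative) that prints what the list index returns and what the key function extracts from each tuple:

```python
import numpy as np

a = np.array([[1, 6], [3, 4], [2, 5]])
b = np.array([[.5], [.8], [.2]])

# list() so the pairs can be indexed on Python 3, where zip() is lazy
package = list(zip(a, b))

# Indexing the list itself: first tuple, then its second item (a row of b)
first_pair = package[0]
print(first_pair[1])

# What sorted() hands to the key: one (row_of_a, row_of_b) tuple at a time
key = lambda pair: pair[0][1]
keys = [int(key(pair)) for pair in package]
print(keys)  # second column of a, in the original row order

sorted_package = sorted(package, key=key)
d, e = zip(*sorted_package)
print(np.array(d).tolist())
print(np.array(e).tolist())
```

The key never sees package as a whole, only one tuple at a time, so its [0][1] is one level deeper than the [0][1] applied to package directly.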
Upvotes: 2
Reputation: 11543
Inside your package, package[0] is (a[0], b[0]), thus package[0][1] is b[0].
Your package is triple-nested. key=lambda dim: dim[0][1] means that item[0][1] is used as the key to sort package. package consists of items, and each item is double-nested.
To see which element you're sorting by, use package[x][0][1], x being the index of that item.
Upvotes: 1
Reputation: 414149
To apply the same sort order to several numpy arrays, you could use np.argsort(). For example, to sort by the second column:
indices = a[:,1].argsort()
print(a[indices])
print(b[indices])
Output:
[[3 4]
 [2 5]
 [1 6]]
[[ 0.8]
 [ 0.2]
 [ 0.5]]
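The same idea could be wrapped in a small function so that the sort column becomes a parameter. This is only a sketch: sort_rows_together is a hypothetical name, and kind="stable" is an addition of mine (not part of the snippet above) so that rows with equal keys keep their original relative order.

```python
import numpy as np

def sort_rows_together(a, b, column):
    """Hypothetical helper: sort the rows of `a` by one column and
    reorder the rows of `b` in exactly the same way."""
    # argsort returns the row indices that would sort the chosen column;
    # kind="stable" (an assumption here) keeps tied rows in original order
    order = np.argsort(a[:, column], kind="stable")
    return a[order], b[order]

a = np.array([[1, 6], [3, 4], [2, 5]])
b = np.array([[.5], [.8], [.2]])

a0, b0 = sort_rows_together(a, b, 0)  # sort by the first column
a1, b1 = sort_rows_together(a, b, 1)  # sort by the second column
print(a0.tolist(), b0.tolist())
print(a1.tolist(), b1.tolist())
```

Because both arrays are indexed with the same index array, they cannot drift out of step, and the results stay numpy arrays rather than tuples of rows as with zip.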
Upvotes: 2