Reputation: 13651
I have two arrays:
a =
[[ 461. 0. ]
[ 480. 15. ]
[ 463. 28. ]]
and
b =
[[ 463. 0. ]
[ 462. 8. ]
[ 466. 15. ]
[ 469. 22. ]
[ 470. 28. ]
[ 473. 34. ]]
I need a resulting array composed of a minus b, but only for the rows where the second column of a => [0 15 28] is in the second column of b => [0 8 15 22 28 34]. All elements of the second column of a will be in the second column of b; I just want to discard the rows of b that don't exist in a. The expected result is:
result =
[[ -2. 0. ]
[ 14. 15. ]
[ -7. 28. ]]
To begin, I thought of getting the 'subarray' of b that contains just the rows I'm interested in. Among many other things, the one I thought would work (and didn't) was this:
result = b[b[:, 1] in a[:, 1]] # not working
Any help is welcome.
Upvotes: 2
Views: 2423
Reputation: 104555
This algorithm works under the following assumptions:

- The second column of a is a subset of the second column of b. This means that we are guaranteed to find every value from the second column of a somewhere in the second column of b.
- a and b are sorted by their second columns.
- There are no duplicate values in the second columns of a and b.
Use numpy.in1d to figure out whether each value in the second column of b can be found in the second column of a. You can then use this Boolean array to slice into b and do your subtraction between the first column of a and the first column of the sliced result of b. The reason this works is the sorted order of b: when you slice into this array with the mask from numpy.in1d, the second column of the sliced result is guaranteed to match the second column of a value for value. Once you have this alignment, you can subtract the first column of the sliced result from the first column of a. To finish things up, copy over the second column of the sliced values of b and stack both of these together:
In [119]: import numpy as np
In [120]: a = np.array([[461, 0], [480, 15], [463, 28]], dtype=float)
In [121]: b = np.array([[463, 0], [462, 8], [466, 15], [469, 22], [470, 28], [473, 34]], dtype=float)
In [122]: ind = np.in1d(b[:,1], a[:,1])
In [123]: np.column_stack([a[:,0]-b[ind,0], b[ind,1]])
Out[123]:
array([[ -2., 0.],
[ 14., 15.],
[ -7., 28.]])
What is returned from numpy.in1d is a Boolean array that tells you whether the ith value in the first input can be found anywhere in the second input. Given your data, it looks like this:
In [124]: ind
Out[124]: array([ True, False, True, False, True, False], dtype=bool)
As you can see, the first, third, and fifth values in b can be found in a. We simply slice into b to extract the right rows, and the second column of the sliced result lines up exactly with the second column of a. We then subtract the first column of this intermediate result from the first column of a.
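In recent NumPy versions, numpy.isin is the recommended replacement for numpy.in1d (which is deprecated); for 1-D inputs it behaves the same way. A minimal sketch of the same computation:

```python
import numpy as np

a = np.array([[461, 0], [480, 15], [463, 28]], dtype=float)
b = np.array([[463, 0], [462, 8], [466, 15], [469, 22], [470, 28], [473, 34]], dtype=float)

# True wherever a value from b's second column also appears in a's second column
mask = np.isin(b[:, 1], a[:, 1])
result = np.column_stack([a[:, 0] - b[mask, 0], b[mask, 1]])
# result -> [[-2., 0.], [14., 15.], [-7., 28.]]
```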
A cleaner approach is to slice the entire matrix out of b instead of just the first column, then subtract its first column from the first column of a in place:
In [125]: out = b[ind]
In [126]: out[:,0] = a[:,0] - out[:,0]
In [127]: out
Out[127]:
array([[ -2., 0.],
[ 14., 15.],
[ -7., 28.]])
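The sorted-order assumption can also be dropped entirely by sorting b's key column once with argsort and looking each key of a up through that order. This goes beyond the answer above; it is a sketch that still assumes every key of a occurs exactly once in b:

```python
import numpy as np

# Same data as the question, but deliberately shuffled
a = np.array([[480, 15], [461, 0], [463, 28]], dtype=float)
b = np.array([[466, 15], [463, 0], [473, 34], [469, 22], [470, 28], [462, 8]], dtype=float)

order = np.argsort(b[:, 1])                  # row order that sorts b's keys
pos = np.searchsorted(b[order, 1], a[:, 1])  # position of each key of a in the sorted keys
idx = order[pos]                             # original row index in b for each row of a
result = np.column_stack([a[:, 0] - b[idx, 0], a[:, 1]])
# result -> [[14., 15.], [-2., 0.], [-7., 28.]]
```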
Upvotes: 5
Reputation: 724
Given the conditions in your comments, the following should work:
import numpy

def calculate_diffs(b, a):
    brow = b[:, 1]
    arow = a[:, 1]
    # Find common indices. For your example, indices == [0, 2, 4]
    indices = numpy.searchsorted(brow, arow)
    r = a.copy()
    r[:, 0] = a[:, 0] - b[indices, 0]
    return r
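For completeness, a self-contained run of this searchsorted approach on the question's arrays; the subtraction is written as a minus b so the signs match the expected result:

```python
import numpy as np

def calculate_diffs(b, a):
    # For each value in a's second column, find the matching row in b.
    # Assumes b's second column is sorted and contains every key of a.
    indices = np.searchsorted(b[:, 1], a[:, 1])
    r = a.copy()
    r[:, 0] = a[:, 0] - b[indices, 0]
    return r

a = np.array([[461, 0], [480, 15], [463, 28]], dtype=float)
b = np.array([[463, 0], [462, 8], [466, 15], [469, 22], [470, 28], [473, 34]], dtype=float)
result = calculate_diffs(b, a)
# rows of result: [-2, 0], [14, 15], [-7, 28]
```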
Upvotes: 1