Reputation: 5895
I have two numpy arrays of unequal size: A (a presorted dataset) and B (a list of query values). For each element of B, I want to find its closest "lower" neighbor in A. Example code below:
import numpy as np
A = np.array([0.456, 2.0, 2.948, 3.0, 7.0, 12.132]) #pre-sorted dataset
B = np.array([1.1, 1.9, 2.1, 5.0, 7.0]) #query values, not necessarily sorted
print(A.searchsorted(B))
# RESULT: [1 1 2 4 4]
# DESIRED: [0 0 1 3 4]
In this example, B[0] falls between A[0] and A[1]. searchsorted returns index 1 because that is where B[0] would have to be inserted to keep A sorted, but what I want is the lower neighbor at index 0. The same goes for B[1:4], and B[4] should be matched with A[4] because both values are identical.
I could do something clunky like this:
desired = []
for b in B:
    id = -1
    for a in A:
        if a > b:
            # first element larger than b: the previous index is the lower neighbor
            if id == -1:
                desired.append(0)
            else:
                desired.append(id)
            break
        id += 1
print(desired)
# RESULT: [0, 0, 1, 3, 4]
But there's got to be a prettier, more concise way to write this with numpy. I'd like to keep my solution in numpy because I'm dealing with large data sets, but I'm open to other options.
Upvotes: 2
Views: 1169
Reputation: 221564
You can pass the optional argument side set to 'right', as mentioned in the docs, and then subtract 1 from the resulting indices to get the desired output, like so:
A.searchsorted(B,side='right')-1
Sample run -
In [63]: A
Out[63]: array([ 0.456, 2. , 2.948, 3. , 7. , 12.132])
In [64]: B
Out[64]: array([ 1.1, 1.9, 2.1, 5. , 7. ])
In [65]: A.searchsorted(B,side='right')-1
Out[65]: array([0, 0, 1, 3, 4])
In [66]: A.searchsorted(A,side='right')-1 # With itself
Out[66]: array([0, 1, 2, 3, 4, 5])
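One edge case is worth spelling out: a query smaller than A[0] gives -1 with this expression, whereas the loop in the question falls back to 0. Here is a minimal sketch that wraps the approach in a helper, assuming you want that same fallback. The name lower_neighbor_indices and the clip flag are made up for illustration and are not part of numpy.

```python
import numpy as np

def lower_neighbor_indices(sorted_values, queries, clip=True):
    """Hypothetical helper: index of the largest element <= each query."""
    # side='right' places ties after equal elements, so subtracting 1
    # lands on the equal element itself (or on the next smaller one).
    idx = np.searchsorted(sorted_values, queries, side='right') - 1
    if clip:
        # Queries below sorted_values[0] come out as -1; map them to 0
        # to mirror the fallback in the question's loop (an assumption).
        idx = np.maximum(idx, 0)
    return idx

A = np.array([0.456, 2.0, 2.948, 3.0, 7.0, 12.132])  # pre-sorted dataset
B = np.array([1.1, 1.9, 2.1, 5.0, 7.0])              # query values

result = lower_neighbor_indices(A, B)
print(result)  # [0 0 1 3 4]

# Out-of-range queries: one below A[0], one above A[-1]
outside = lower_neighbor_indices(A, np.array([0.1, 20.0]))
print(outside)  # [0 5]
```

With clip=False a result of -1 is left in place. Be careful with that, because A[-1] would silently index the last element.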
Upvotes: 3
Reputation: 514
Here's one way to do this. np.argmax returns the index of the first True it encounters, so as long as A is sorted, subtracting 1 from it gives the index of the lower neighbor.
[np.argmax(A>b)-1 for b in B]
Edit: I got the inequality wrong initially; it works now.
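The list comprehension can also be written without a Python-level loop by broadcasting the comparison. This is only a sketch of that idea, and the variable names are illustrative. Note that argmax returns 0 when a row contains no True at all (a query that is >= every element of A), so that case is handled separately here.

```python
import numpy as np

A = np.array([0.456, 2.0, 2.948, 3.0, 7.0, 12.132])  # pre-sorted dataset
B = np.array([1.1, 1.9, 2.1, 5.0, 7.0])              # query values

# mask[i, j] is True where A[j] > B[i]; shape is (len(B), len(A))
mask = A[None, :] > B[:, None]

# Index of the first element of A that is strictly greater than each query
first_greater = mask.argmax(axis=1)

# Rows with no True mean the query is >= max(A): use the last index of A
has_greater = mask.any(axis=1)
idx = np.where(has_greater, first_greater - 1, len(A) - 1)

print(idx)  # [0 0 1 3 4]
```

The intermediate mask holds len(A) * len(B) booleans, so for large data sets the searchsorted approach is likely the better fit. As with the list comprehension, a query below A[0] still produces -1 here.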
Upvotes: 1