Reputation: 10203

Find absolute minimum between two numpy arrays but keep the sign

Consider two arrays of different length:

A = np.array([58, 22, 86, 37, 64])

B = np.array([105, 212,   5, 311, 253, 419, 123, 461, 256, 464])

For each value in A, I want to find the smallest absolute difference between values in A and B. I use Pandas because my actual arrays are subsets of Pandas dataframes but also because the apply method is a convenient (albeit slow) approach to taking the difference between two different-sized arrays:

In [22]: pd.Series(A).apply(lambda x: np.min(np.abs(x-B)))
Out[22]:
0    47
1    17
2    19
3    32
4    41
dtype: int64

BUT I also want to keep the sign, so the desired output is:

0    -47
1     17
2    -19
3     32
4    -41
dtype: int64

[update] my actual arrays A and B are approximately of 5e4 and 1e6 in length so a low memory solution would be ideal. Also, I wish to avoid using Pandas because it is very slow on the actual arrays.

Upvotes: 3

Answers (4)

piRSquared

Reputation: 294258

Comprehension

I couldn't help myself. This is not what you should do! But, it is cute.

[min(x - B, key=abs) for x in A]

[-47, 17, -19, 32, -41]

Reduced Big-O Numpy solution

If N = len(A) and M = len(B) then this solution should be O(N + M log(M)) If B is already sorted, then the sorting step is unnecessary. and this becomes O(N + M)

C          = np.sort(B)
a          = C.searchsorted(A)

# It is possible that `i` has a value equal to the length of `C`
# in which case the searched value exceeds all those found in `C`.
# In that case, we want to clip the index value to the right most index
# which is `len(C) - 1`
right      = np.minimum(a, len(C) - 1)

# For those searched values that are less than the first value in `C`
# the index value will be `0`.  When I subtract `1`, we'll end up with
# `-1` which makes no sense.  So we clip it to `0`.
left       = np.maximum(a - 1, 0)

For clipped values, we'll end up comparing a value to itself and therefore it is safe.

right_diff = A - C[right]
left_diff  = A - C[left ]

np.where(np.abs(right_diff) <= left_diff, right_diff, left_diff)

array([-47,  17, -19,  32, -41])

Upvotes: 3

cs95

Reputation: 402463

Let's use broadcasted subtraction here. We then use argmin to find the absolute minimum, then extract the values in a subsequent step.

u = A[:,None] - B
idx = np.abs(u).argmin(axis=1)

u[np.arange(len(u)), idx]
# array([-47,  17, -19,  32, -41])

This uses pure NumPy broadcasting, so it should be quite fast.

Upvotes: 5

Quang Hoang

Reputation: 150735

Since you tagged pandas:

# compute the diff by broadcasting
diff = pd.DataFrame(A[None,:] - B[:,None])
# mininum value
min_val = diff.abs().min()

# mask with where and stack to drop na
diff.where(diff.abs().eq(min_val)).stack()

Output:

0  0   -47.0
   2   -19.0
   4   -41.0
2  1    17.0
   3    32.0
dtype: float64

Upvotes: 2

Imperishable Night

Reputation: 1533

np.argmin can find the position of the minimum value. Therefore you can simply do this:

pd.Series(A).apply(lambda x: x-B[np.argmin(np.abs(x-B))])

Upvotes: 2

Find absolute minimum between two numpy arrays but keep the sign

Answers (4)

Comprehension

Reduced Big-O Numpy solution

Related Questions