Ruairi O'Sullivan

Reputation: 61

Minimum distance for each value in one array with respect to another

I have two numpy arrays of integers, A and B. The values in A and B are the time points at which events A and B occurred. I would like to transform A so that it contains, for each event A, the time elapsed since the most recent event B.

I know I need to subtract from each element of A the nearest element of B that is less than or equal to it, but I am unsure how to do so. Any help would be greatly appreciated.

>>> import numpy as np

>>> A = np.array([11, 12, 13, 17, 20, 22, 33, 34])
>>> B = np.array([5, 10, 15, 20, 25, 30])

Desired Result:

>>> cond_a = relative_timestamp(to_transform=A, reference=B)
>>> cond_a
array([1, 2, 3, 2, 0, 2, 3, 4])

Upvotes: 4

Views: 898

Answers (3)

yatu

Reputation: 88276

Here's an approach that computes all pairwise differences. Note that it has O(n**2) complexity, so for larger arrays @brenlla's answer will perform much better.

The idea here is to use np.subtract.outer and then take the minimum difference along axis 1 of a masked array, so that only values in B less than or equal to the corresponding value in A are considered:

dif = np.abs(np.subtract.outer(A,B))
np.ma.array(dif, mask = A[:,None] < B).min(1).data
# array([1, 2, 3, 2, 0, 2, 3, 4])
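
To make the masking step more concrete, here is a rough sketch of the same idea written with plain broadcasting and np.where instead of a masked array. The function name pairwise_time_since and the sentinel trick are illustrative choices of mine, not part of the code above. It assumes integer arrays and that every element of A has at least one element of B less than or equal to it:

```python
import numpy as np

def pairwise_time_since(a, b):
    # Hypothetical helper name, used for illustration only.
    a = np.asarray(a)
    b = np.asarray(b)
    # diff[i, j] = a[i] - b[j]; negative wherever b[j] occurs after a[i].
    diff = a[:, None] - b[None, :]
    # Replace entries for later events with a large sentinel so they
    # can never be selected as the row minimum.
    sentinel = np.iinfo(diff.dtype).max
    return np.where(diff >= 0, diff, sentinel).min(axis=1)

A = np.array([11, 12, 13, 17, 20, 22, 33, 34])
B = np.array([5, 10, 15, 20, 25, 30])
print(pairwise_time_since(A, B).tolist())  # [1, 2, 3, 2, 0, 2, 3, 4]
```

Like the masked-array version, this builds a full len(A) x len(B) matrix, so memory grows with the product of the two lengths.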

Upvotes: 1

Brenlla

Reputation: 1481

You can use np.searchsorted to find the indices where the elements of A should be inserted in B to maintain order. With side='right', the element just before each insertion point is the largest element of B that is less than or equal to the corresponding element of A (B must be sorted):

idx = np.searchsorted(B, A, side='right')
result = A - B[idx - 1]  # subtract one to index the last element of B <= each element of A

According to the docs searchsorted uses binary search, so it will scale fine for large inputs.
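
One edge case worth keeping in mind: if an element of A is smaller than every element of B, idx is 0 and B[idx-1] silently wraps around to the last element of B. A possible way to guard against that is sketched below. The function and parameter names follow the question, while fill_value (the marker for "no earlier event B") is an assumption of mine, as is sorting the reference defensively:

```python
import numpy as np

def relative_timestamp(to_transform, reference, fill_value=-1):
    # fill_value is an assumed convention meaning "no earlier reference event".
    to_transform = np.asarray(to_transform)
    reference = np.sort(np.asarray(reference))  # searchsorted needs sorted input
    # idx[i] = number of reference values <= to_transform[i]
    idx = np.searchsorted(reference, to_transform, side='right')
    has_previous = idx > 0
    result = np.full(to_transform.shape, fill_value, dtype=to_transform.dtype)
    # Only subtract where an earlier (or simultaneous) reference event exists.
    result[has_previous] = (
        to_transform[has_previous] - reference[idx[has_previous] - 1]
    )
    return result

A = np.array([11, 12, 13, 17, 20, 22, 33, 34])
B = np.array([5, 10, 15, 20, 25, 30])
print(relative_timestamp(A, B).tolist())                 # [1, 2, 3, 2, 0, 2, 3, 4]
print(relative_timestamp(np.array([3, 7]), B).tolist())  # [-1, 2]
```

Note that to_transform does not need to be sorted here; only the reference array does.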

Upvotes: 2

Sparky05

Reputation: 4892

I am not sure that computing all pairwise differences is really faster than a Python loop over the array entries (worst case O(len(A) + len(B)), assuming both arrays are sorted), so here is a solution with a loop:

A = np.array([11, 12, 13, 17, 20, 22, 33, 34])
B = np.array([5, 10, 15, 20, 25, 30])

def calculate_next_distance(to_transform, reference):
    max_reference = len(reference) - 1
    current_reference = 0
    transformed_values = np.zeros_like(to_transform)
    for i, value in enumerate(to_transform):
        while current_reference < max_reference and reference[current_reference+1] <= value:
            current_reference += 1
        transformed_values[i] = value - reference[current_reference]
    return transformed_values

calculate_next_distance(A,B)
# array([1, 2, 3, 2, 0, 2, 3, 4])
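
As a quick sanity check, the loop can be compared against a vectorised np.searchsorted version on random data. This is only a sketch: the helper names, the seed and the array sizes are arbitrary choices of mine, both arrays are sorted first, and time points before the first reference event are dropped because the two versions treat that case differently:

```python
import numpy as np

def loop_distance(to_transform, reference):
    # Same two-pointer idea as the loop above; assumes both arrays are sorted.
    out = np.zeros_like(to_transform)
    j = 0
    last = len(reference) - 1
    for i, value in enumerate(to_transform):
        while j < last and reference[j + 1] <= value:
            j += 1
        out[i] = value - reference[j]
    return out

def searchsorted_distance(to_transform, reference):
    # Vectorised lookup of the last reference value <= each time point.
    idx = np.searchsorted(reference, to_transform, side='right')
    return to_transform - reference[idx - 1]

rng = np.random.default_rng(0)  # arbitrary seed
reference = np.sort(rng.integers(0, 1000, size=50))
to_transform = np.sort(rng.integers(0, 1000, size=200))
# Keep only time points at or after the first reference event.
to_transform = to_transform[to_transform >= reference[0]]

agree = np.array_equal(
    loop_distance(to_transform, reference),
    searchsorted_distance(to_transform, reference),
)
print(agree)  # True
```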

Upvotes: 0
