Radio Controlled

Reputation: 950

Numpy: How to best align two sorted arrays?

In order to combine time series data, I am left with the following essential step:

>>> xs1
array([ 0, 10, 12, 16, 25, 29])
>>> xs2
array([ 0,  5, 10, 15, 20, 25, 30])

How can I best obtain the following result?

>>> xs1_ = np.array([0,0,10,12,12,16,16,25,29,29])
>>> xs2_ = np.array([0,5,10,10,15,15,20,25,25,30])

This is to align the measurements taken at times xs1 and xs2.

Imagine that the measurement from series xs1 at time 0 remains valid until the next measurement in that series is made, at time 10. We could interpolate both series onto a common grid spaced at their greatest common divisor, but that is most likely 1 and would bloat the data enormously. It would therefore be better to interpolate only at the union of xs1 and xs2. In xs1_ and xs2_, the x-values to compare are aligned by index. For example, we compare time 5 in xs2_ with time 0 in xs1_, because the next measurement in xs1 only comes later, at time 10. Visually, imagine a step plot of both measurements (the y-values are not shown here), where we always compare the line segments lying above each other.

Although I am struggling to name this task, I believe it is a problem of general interest, so I think it is appropriate to ask here for the best solution.

Upvotes: 1

Views: 662

Answers (2)

yatu

Reputation: 88236

Here's a vectorised approach:

import numpy as np

xs1 = np.array([ 0, 10, 12, 16, 25, 29])
xs2 = np.array([ 0,  5, 10, 15, 20, 25, 30])

# union of both sets
xs = np.array(sorted(set(xs1) | set(xs2)))
# array([ 0,  5, 10, 12, 15, 16, 20, 25, 29, 30])

# np.isin replaces np.in1d, which is deprecated as of NumPy 2.0
xs1_ = np.maximum.accumulate(np.isin(xs, xs1) * xs)
print(xs1_)
# [ 0  0 10 12 12 16 16 25 29 29]

xs2_ = np.maximum.accumulate(np.isin(xs, xs2) * xs)
print(xs2_)
# [ 0  5 10 10 15 15 20 25 25 30]

Where, in both cases, an expression such as:

np.isin(xs, xs1) * xs
# array([ 0,  0, 10, 12,  0, 16,  0, 25, 29,  0])

gives an array holding the values of xs that are contained in xs1, and 0 for those that aren't. We then just need to forward fill using np.maximum.accumulate. This works because the times are sorted and non-negative, so the running maximum carries the last seen value forward.
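
If the times may be negative, or if the matching index is also needed to look up the y-values, a lookup with np.searchsorted is one possible alternative. The following is only a sketch: the helper name align_to is made up for illustration, and it assumes that the series is sorted in ascending order and that no union value is smaller than the first element of the series.

```python
import numpy as np

def align_to(xs, series):
    """Return, for each value in xs, the last element of `series` that is <= it.

    Hypothetical helper. Assumes `series` is sorted ascending and that
    xs.min() >= series[0]; otherwise index -1 would silently wrap around.
    """
    idx = np.searchsorted(series, xs, side="right") - 1
    return series[idx]

xs1 = np.array([0, 10, 12, 16, 25, 29])
xs2 = np.array([0, 5, 10, 15, 20, 25, 30])
xs = np.union1d(xs1, xs2)  # sorted union of both series

xs1_ = align_to(xs, xs1)
# array([ 0,  0, 10, 12, 12, 16, 16, 25, 29, 29])
xs2_ = align_to(xs, xs2)
# array([ 0,  5, 10, 10, 15, 15, 20, 25, 25, 30])
```

The same idx array could be used to index the y-values of each series, which is presumably what the step-plot comparison needs in the end.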

Upvotes: 1

HappyCloudNinja

Reputation: 426

Here is my proposal:

import numpy as np

a = np.array([0, 10, 12, 16, 25, 29])
b = np.array([0, 5, 10, 15, 20, 25, 30])
c = set(a).union(b)
# c = {0, 5, 10, 12, 15, 16, 20, 25, 29, 30}
# iterate over sorted(c): a set has no guaranteed iteration order
xs1_ = [max(i for i in a if i <= j) for j in sorted(c)]
# [0, 0, 10, 12, 12, 16, 16, 25, 29, 29]
xs2_ = [max(i for i in b if i <= j) for j in sorted(c)]
# [0, 5, 10, 10, 15, 15, 20, 25, 25, 30]

1) a and b are your two original arrays.
2) c is a set representing the union of your two arrays. This gives you every value present in either array.
3) Then, for each element of this set, I select the largest value in a (or b) that is smaller than or equal to this element.
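
The three steps above can be sketched as a small function. This is only an illustration, not a tuned implementation: the name align_lists is hypothetical, the union is sorted explicitly because iterating over a set does not guarantee any order, and bisect_right stands in for the inner scan over a or b. It assumes both inputs are sorted and start at the same value.

```python
from bisect import bisect_right

def align_lists(a, b):
    """Hypothetical helper: align two sorted sequences on their sorted union.

    Assumes a and b are sorted ascending and share the same first value,
    so every union element has a predecessor (or equal) in both.
    """
    c = sorted(set(a).union(b))                   # step 2: union, made ordered
    a_ = [a[bisect_right(a, j) - 1] for j in c]   # step 3: largest value <= j
    b_ = [b[bisect_right(b, j) - 1] for j in c]
    return a_, b_

xs1_, xs2_ = align_lists([0, 10, 12, 16, 25, 29], [0, 5, 10, 15, 20, 25, 30])
# xs1_ == [0, 0, 10, 12, 12, 16, 16, 25, 29, 29]
# xs2_ == [0, 5, 10, 10, 15, 15, 20, 25, 25, 30]
```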

Upvotes: 2
