Reputation: 950
In order to combine time series data, I am left with the following essential step:
>>> xs1
array([ 0, 10, 12, 16, 25, 29])
>>> xs2
array([ 0, 5, 10, 15, 20, 25, 30])
What is the best way to obtain the following results?
>>> xs1_ = np.array([0,0,10,12,12,16,16,25,29,29])
>>> xs2_ = np.array([0,5,10,10,15,15,20,25,25,30])
This is to align the measurements taken at times xs1 and xs2.
Imagine that the measurement from series xs1 at time 0 is valid until the next measurement in this series has been made, which is at time 10. We could interpolate both series onto a grid whose spacing is the greatest common divisor of the time steps, but that is most likely 1 and creates a huge bloat. Therefore it would be better to interpolate only on the union of xs1 and xs2. In xs1_ and xs2_, the x-values to compare are aligned by list index. I.e. we compare time 5 in series xs2_ with time 0 in series xs1_, as the next measurement in series xs1_ only comes later, at time 10. From a visual point of view, imagine a step plot for both measurements (the y-values are not shown here) where we always compare the lines lying above each other.
Although I am struggling to name this task, I believe it is a problem of general interest and therefore think it is appropriate to ask here for the best solution.
Upvotes: 1
Views: 662
Reputation: 88236
Here's a vectorised approach:
import numpy as np
xs1 = np.array([ 0, 10, 12, 16, 25, 29])
xs2 = np.array([ 0, 5, 10, 15, 20, 25, 30])
# sorted union of both sets
xs = np.array(sorted(set(xs1) | set(xs2)))
# array([ 0, 5, 10, 12, 15, 16, 20, 25, 29, 30])
# np.isin is the current name for np.in1d, which newer NumPy versions deprecate
xs1_ = np.maximum.accumulate(np.isin(xs, xs1) * xs)
print(xs1_)
# [ 0  0 10 12 12 16 16 25 29 29]
xs2_ = np.maximum.accumulate(np.isin(xs, xs2) * xs)
print(xs2_)
# [ 0  5 10 10 15 15 20 25 25 30]
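Where, for both cases, an expression such as:
np.isin(xs, xs1) * xs
# array([ 0, 0, 10, 12, 0, 16, 0, 25, 29, 0])
gives an array with the values of xs that are contained in xs1, and 0 for those that aren't (np.isin is the newer name for np.in1d). We then just need to forward fill using np.maximum.accumulate, which works here because xs is sorted and the times are non-negative.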
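If the times could be negative, the 0 placeholder would no longer be a safe fill value. As a hedged alternative (not part of the approach above), the same forward fill can be sketched with np.union1d and np.searchsorted. The helper name last_at_or_before is made up for illustration, and the sketch assumes each series is sorted and starts no later than the first time of the union:

```python
import numpy as np

xs1 = np.array([0, 10, 12, 16, 25, 29])
xs2 = np.array([0, 5, 10, 15, 20, 25, 30])

# Sorted union of both time axes.
xs = np.union1d(xs1, xs2)

def last_at_or_before(series, grid):
    """For each time in `grid`, return the latest time in `series` that is <= it.

    Hypothetical helper; assumes `series` is sorted and series[0] <= grid[0].
    """
    # side="right" counts the elements <= each grid time; minus 1 gives
    # the index of the last such element.
    idx = np.searchsorted(series, grid, side="right") - 1
    return series[idx]

xs1_ = last_at_or_before(xs1, xs)
xs2_ = last_at_or_before(xs2, xs)
print(xs1_)
print(xs2_)
```

The index array computed inside the helper could equally be used to pick the matching y-values of each series, which is what the step-plot comparison in the question needs.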
Upvotes: 1
Reputation: 426
Here is my proposition:
import numpy as np
a = np.array([0, 10, 12, 16, 25, 29])
b = np.array([0, 5, 10, 15, 20, 25, 30])
# sort the union: sets have no guaranteed iteration order
c = sorted(set(a).union(b))
# c = [0, 5, 10, 12, 15, 16, 20, 25, 29, 30]
xs1_ = [max(i for i in a if i <= j) for j in c]
# [0, 0, 10, 12, 12, 16, 16, 25, 29, 29]
xs2_ = [max(i for i in b if i <= j) for j in c]
# [0, 5, 10, 10, 15, 15, 20, 25, 25, 30]
1) a and b are your two original arrays.
2) c represents the union of your two arrays. By doing this, you get every value present in either array.
3) Then, for each element of c, I select the largest value in a (or in b) that is smaller than or equal to this element.
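Step 3 rescans a or b for every element of the union, which gets slow for long series. As a rough sketch of the same idea, assuming both inputs are already sorted, a binary search from the standard library's bisect module can find that largest value directly. The helper name floor_value is hypothetical:

```python
from bisect import bisect_right

a = [0, 10, 12, 16, 25, 29]
b = [0, 5, 10, 15, 20, 25, 30]

# Sorted union of both series.
c = sorted(set(a) | set(b))

def floor_value(sorted_values, t):
    """Largest element of sorted_values that is <= t.

    Hypothetical helper; assumes sorted_values is sorted ascending and
    contains at least one element <= t.
    """
    # bisect_right returns how many elements are <= t.
    return sorted_values[bisect_right(sorted_values, t) - 1]

xs1_ = [floor_value(a, t) for t in c]
xs2_ = [floor_value(b, t) for t in c]
print(xs1_)
print(xs2_)
```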
Upvotes: 2