Reputation: 8391
I have two (or more but two is enough to solve the core problem) time series in the form of a list of pairs (time, value)
:
Series x
time 0 2 4 5 8 9
value x0 x2 x4 x5 x8 x9
Series y
time 0 2 3 5 7 8
value y0 y2 y3 y5 y7 y8
What I want to get is a single time series of tuples from the two original series in such a way that where the times align, the values at that time are used and where they do not, the value from the last time that was available is used:
time 0 2 3 4 5 7 8 9
x vals x0 x2 x2 x4 x5 x5 x8 x9
y vals y0 y2 y3 y3 y5 y7 y8 y8
In my concrete environment the two time series are 2-D numpy arrays where in one row holds the times and the other row holds the values. Also it can be assumed that all the series start at the same time (e.g. 0) so they always align at the start.
This can, of course, be solved "manually" by a loop with "pointers" to the current position in the series and moving the pointers. However, is there a simple and elegant way of doing this in numpy?
Upvotes: 0
Views: 236
Reputation: 150735
You do want pandas
for this:
df1 = pd.DataFrame({'time':[0,2,4,5,8,9],
'value':['x0','x2','x4','x5','x8','x9']})
df2 = pd.DataFrame({'time':[0,2,3,5,7,8],
'value':['y0','y2','y3','y5','y7','y8']})
df1.merge(df2, on='time', how='outer').sort_values('time').ffill()
Output:
time value_x value_y
0 0 x0 y0
1 2 x2 y2
6 3 x2 y3
2 4 x4 y3
3 5 x5 y5
7 7 x5 y7
4 8 x8 y8
5 9 x9 y8
Upvotes: 1