Reputation: 558
I have a compact DataFrame with unevenly spaced timestamps that I want to expand so that there is a one-second interval between consecutive timestamps. In the new (NaN) rows, I want to fill in the average of the preceding and succeeding values from the compact DataFrame. The compact df looks like this:
import pandas as pd

tv = ["2021-02-10 21:00:00", "2021-02-10 21:00:01", "2021-02-10 21:00:04",
      "2021-03-10 21:00:10", "2021-03-10 21:00:12", "2021-04-10 21:05:15",
      "2021-04-10 21:05:19"]
c1 = [1, 14, 7, 3, 8, 9, 10]
c2 = [4, 5, 6, 1, 8, 5, 3]
df = pd.DataFrame(list(zip(c1, c2)),
                  index=pd.DatetimeIndex(tv, dtype='datetime64[ns, Europe/Amsterdam]'),
                  columns=['C1', 'C2'])
df.index.names = ['timestamp']
Note that the time delta between records can sometimes be more than a day. What I want to achieve is a DataFrame that looks like this:
timestamp C1 C2
2021-02-10 21:00:00 1.0 4.0
2021-02-10 21:00:01 14.0 5.0
2021-02-10 21:00:02 (14+7)/2 (5+6)/2
2021-02-10 21:00:03 (14+7)/2 (5+6)/2
2021-02-10 21:00:04 7.0 6.0
... ... ...
2021-04-10 21:05:15 9.0 5.0
2021-04-10 21:05:16 (9+10)/2 (5+3)/2
2021-04-10 21:05:17 (9+10)/2 (5+3)/2
2021-04-10 21:05:18 (9+10)/2 (5+3)/2
2021-04-10 21:05:19 10.0 3.0
The expressions in parentheses just indicate which operations should be performed, e.g. (14+7)/2 = 10.5 or (9+10)/2 = 9.5.
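To make the rule concrete, here is a minimal plain-Python sketch of it, independent of pandas. The helper name `fill_with_neighbor_mean` is hypothetical, and the sketch assumes the first and last entries of the list are always known:

```python
def fill_with_neighbor_mean(values):
    """Replace each None with the mean of the nearest known values around it.

    Assumes the first and last entries are known (not None).
    """
    filled = list(values)
    for i, v in enumerate(values):
        if v is None:
            # Nearest known value before and after position i.
            prev = next(x for x in reversed(values[:i]) if x is not None)
            nxt = next(x for x in values[i + 1:] if x is not None)
            filled[i] = (prev + nxt) / 2
    return filled


# C1 between 21:00:01 (14) and 21:00:04 (7): both missing seconds get 10.5.
print(fill_with_neighbor_mean([14, None, None, 7]))  # [14, 10.5, 10.5, 7]
```

Every row inside a gap gets the same value, which is what distinguishes this from linear interpolation.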
I have found a closely related post here, but its accepted answer uses DataFrame.interpolate(), and I really do not want to use linear interpolation. So far, I haven't found any way to achieve the result with DataFrame.mean().
The furthest I have got is the following code:
df1s = df.asfreq('1S')
dfb = df1s.fillna(method='bfill')
dff = df1s.fillna(method='ffill')
dfmeans = (dfb+dff)/2
This produces the desired result, but I wonder if there is a more Pythonic way to achieve it. Thanks.
Upvotes: 0
Views: 133
Reputation: 862841
I think your solution is Pythonic; you can use ffill and bfill instead of fillna:
df1s = df.asfreq('1S')
dfmeans = (df1s.bfill()+df1s.ffill())/2
Or:
df1s = df.asfreq('1S')
dfmeans = df1s.bfill().add(df1s.ffill()).div(2)
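As a quick check, here is a self-contained sketch of the same approach. It uses only the first three rows of the question's data, which is an assumption made to keep the expanded frame small. It also uses the lowercase "1s" alias, which newer pandas releases prefer over "1S":

```python
import pandas as pd

# Small sample with an uneven gap (first three rows of the question's data).
tv = ["2021-02-10 21:00:00", "2021-02-10 21:00:01", "2021-02-10 21:00:04"]
df = pd.DataFrame(
    {"C1": [1, 14, 7], "C2": [4, 5, 6]},
    index=pd.DatetimeIndex(tv, tz="Europe/Amsterdam", name="timestamp"),
)

# Expand to a one-second grid; the newly created rows are NaN.
df1s = df.asfreq("1s")

# On original rows bfill and ffill both return the value itself, so the mean
# leaves them unchanged. On gap rows bfill returns the next known value and
# ffill the previous one, so the mean is their average.
dfmeans = (df1s.bfill() + df1s.ffill()) / 2

print(dfmeans)
```

Note that asfreq materializes one row per second across the whole index range. With gaps of a month or more, as in the full sample data, the result has millions of rows, so memory use is worth keeping in mind.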
Upvotes: 1