eliasmaxil

Reputation: 558

Average between values with unevenly distributed time in Pandas DataFrame

I have a compact DataFrame with unevenly spaced timestamps. I want to expand it to a one-second interval between timestamps, and fill each new (NaN) row with the average of the preceding and the succeeding value from the compact DataFrame. The compact df looks like this:

import pandas as pd

tv = ["2021-02-10 21:00:00", "2021-02-10 21:00:01", "2021-02-10 21:00:04",
      "2021-03-10 21:00:10", "2021-03-10 21:00:12", "2021-04-10 21:05:15",
      "2021-04-10 21:05:19"]
c1 = [1, 14, 7, 3, 8, 9, 10]
c2 = [4, 5, 6, 1, 8, 5, 3]
df = pd.DataFrame(list(zip(c1, c2)),
                  index=pd.DatetimeIndex(tv, dtype='datetime64[ns, Europe/Amsterdam]'),
                  columns=['C1', 'C2'])
df.index.names = ['timestamp']

Note that the time delta between records can sometimes be more than a day. What I want to achieve is a DataFrame that looks like this:

timestamp            C1        C2
2021-02-10 21:00:00  1.0       4.0
2021-02-10 21:00:01  14.0      5.0
2021-02-10 21:00:02  (14+7)/2  (5+6)/2
2021-02-10 21:00:03  (14+7)/2  (5+6)/2
2021-02-10 21:00:04  7.0       6.0
...                  ...       ...
2021-04-10 21:05:15  9.0       5.0
2021-04-10 21:05:16  (9+10)/2  (5+3)/2
2021-04-10 21:05:17  (9+10)/2  (5+3)/2
2021-04-10 21:05:18  (9+10)/2  (5+3)/2
2021-04-10 21:05:19  10.0      3.0

The operations in parentheses only indicate which calculation should be performed, e.g. (14+7)/2 = 10.5 or (9+10)/2 = 9.5.

I found a closely related post here, but its accepted answer uses DataFrame.interpolate(), and I do not want linear interpolation. So far I haven't found a way to achieve the result with DataFrame.mean().

The furthest I have achieved is the following code:

df1s = df.asfreq('1S')
dfb = df1s.fillna(method='bfill')
dff = df1s.fillna(method='ffill')
dfmeans = (dfb+dff)/2

This produces the desired result, but I wonder if there is a more Pythonic way to achieve it. Thanks.

Upvotes: 0

Views: 133

Answers (1)

jezrael

Reputation: 862841

I think your solution is already Pythonic. You can use ffill and bfill instead of fillna:

df1s = df.asfreq('1S')
dfmeans = (df1s.bfill()+df1s.ffill())/2

Or:

df1s = df.asfreq('1S')
dfmeans = df1s.bfill().add(df1s.ffill()).div(2)
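
To make the idea easy to check, here is a minimal self-contained sketch on a small hypothetical sample (three rows that mirror the start of the question's data, not the full DataFrame). It assumes a recent pandas version, so it uses the lowercase "1s" alias, since newer releases deprecate the uppercase "S" alias as well as fillna(method=...).

```python
import pandas as pd

# Hypothetical small sample (not the asker's full data): irregular timestamps.
idx = pd.DatetimeIndex(
    ["2021-02-10 21:00:00", "2021-02-10 21:00:01", "2021-02-10 21:00:04"],
    tz="Europe/Amsterdam",
    name="timestamp",
)
df = pd.DataFrame({"C1": [1, 14, 7], "C2": [4, 5, 6]}, index=idx)

# Expand to a regular one-second grid; the newly created rows are NaN.
df1s = df.asfreq("1s")

# Average the next known value (bfill) and the previous known value (ffill).
# On the original rows both fills return the same value, so they stay unchanged.
dfmeans = (df1s.bfill() + df1s.ffill()) / 2

print(dfmeans)
```

With this sample, the two inserted rows (21:00:02 and 21:00:03) get 10.5 for C1 and 5.5 for C2, while the three original rows keep their values. Keep in mind that with gaps of more than a day, as in the question, asfreq creates one row per second across the whole gap, so the expanded frame can become large.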

Upvotes: 1
