bazinga

Reputation: 41

Resampling time series data in pandas with the average labeled at the end of the interval

My DataFrame looks like this:


    ts                        Values
    2021-01-01 09:00:00+00:00   0.00
    2021-01-01 09:10:00+00:00   0.01
    2021-01-01 09:20:00+00:00   0.03
    2021-01-01 09:30:00+00:00   0.07
    2021-01-01 09:40:00+00:00   0.09
    2021-01-01 09:50:00+00:00   0.14
    2021-01-01 10:00:00+00:00   0.12
    2021-01-01 10:10:00+00:00   0.14
    2021-01-01 10:20:00+00:00   0.18
    2021-01-01 10:30:00+00:00   0.16
    2021-01-01 10:40:00+00:00   0.14
    2021-01-01 10:50:00+00:00   0.21
    2021-01-01 11:00:00+00:00   0.16

My code for resampling is:

    df = round(df.resample('1H').mean(), 2).fillna(0)

Here `fillna(0)` only replaces empty cells with 0, so it is not a big deal. When I run this code, my output looks like this:


    ts                        Values
    2021-01-01 09:00:00+00:00   0.06
    2021-01-01 10:00:00+00:00   0.16
    2021-01-01 11:00:00+00:00   0.16
    2021-01-01 12:00:00+00:00   0.07

What I actually want is to take the values from 09:00 to 09:50 and write their mean at 10:00. By default, the 09:00 to 09:50 interval is labeled 09:00; I want it labeled 10:00.

The expected output is:


    ts                        Values
    2021-01-01 10:00:00+00:00   0.06
    2021-01-01 11:00:00+00:00   0.16
    2021-01-01 12:00:00+00:00   0.16

Upvotes: 1

Views: 31

Answers (1)

mozway

Reputation: 260825

You can use:

    df['Values'].groupby(df.index.ceil('H')).mean()

Output:

    ts
    2021-01-01 09:00:00+00:00    0.000000
    2021-01-01 10:00:00+00:00    0.076667
    2021-01-01 11:00:00+00:00    0.165000
    Name: Values, dtype: float64

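Or, to have the 09:00 to 09:50 values count towards 10:00 (floor each timestamp to the hour, then shift the label forward by one hour):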

    df['Values'].groupby(df.index.floor('H') + pd.Timedelta('1h')).mean()

Output:

    ts
    2021-01-01 10:00:00+00:00    0.056667
    2021-01-01 11:00:00+00:00    0.158333
    2021-01-01 12:00:00+00:00    0.160000
    Name: Values, dtype: float64

Upvotes: 1
