Reputation: 41
My df data looks like this:
ts Values
2021-01-01 09:00:00+00:00 0.00
2021-01-01 09:10:00+00:00 0.01
2021-01-01 09:20:00+00:00 0.03
2021-01-01 09:30:00+00:00 0.07
2021-01-01 09:40:00+00:00 0.09
2021-01-01 09:50:00+00:00 0.14
2021-01-01 10:00:00+00:00 0.12
2021-01-01 10:10:00+00:00 0.14
2021-01-01 10:20:00+00:00 0.18
2021-01-01 10:30:00+00:00 0.16
2021-01-01 10:40:00+00:00 0.14
2021-01-01 10:50:00+00:00 0.21
2021-01-01 11:00:00+00:00 0.16
My code for resampling is:
df = round(df.resample('1H').mean(), 2).fillna(0)
Here fillna(0) just replaces empty cells with 0, so it's not a big deal. When I run this code, my output looks like this:
ts Values
2021-01-01 09:00:00+00:00 0.06
2021-01-01 10:00:00+00:00 0.16
2021-01-01 11:00:00+00:00 0.16
2021-01-01 12:00:00+00:00 0.07
What I actually want is to take the values from 09:00 to 09:50 and write their mean at 10:00. By default, the mean of 09:00 to 09:50 is labeled 09:00. I want it labeled 10:00.
The expected output is:
ts Values
2021-01-01 10:00:00+00:00 0.06
2021-01-01 11:00:00+00:00 0.16
2021-01-01 12:00:00+00:00 0.16
Upvotes: 1
Views: 31
Reputation: 260825
You can use:
df['Values'].groupby(df.index.ceil('H')).mean()
output:
ts
2021-01-01 09:00:00+00:00 0.000000
2021-01-01 10:00:00+00:00 0.076667
2021-01-01 11:00:00+00:00 0.165000
Name: Values, dtype: float64
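The 09:00 row appears because ceil leaves a timestamp that already sits exactly on the hour unchanged, so 09:00 forms its own group. A small check of that behaviour, as a sketch: the two-element index is made up for illustration, and '60min' is used in place of 'H' only because the spelling of the hourly alias differs between pandas versions.

```python
import pandas as pd

# Two illustrative timestamps: one exactly on the hour, one ten minutes later.
idx = pd.to_datetime(["2021-01-01 09:00:00+00:00", "2021-01-01 09:10:00+00:00"])

# ceil keeps 09:00 at 09:00 and moves 09:10 up to 10:00.
ceiled = idx.ceil("60min")

# floor + 1 hour moves both timestamps to 10:00.
floored_plus_hour = idx.floor("60min") + pd.Timedelta("1h")

print(ceiled)
print(floored_plus_hour)
```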
Or, to label each hour with its end so that 09:00 to 09:50 is reported at 10:00:
df['Values'].groupby(df.index.floor('H')+pd.Timedelta('1h')).mean()
output:
ts
2021-01-01 10:00:00+00:00 0.056667
2021-01-01 11:00:00+00:00 0.158333
2021-01-01 12:00:00+00:00 0.160000
Name: Values, dtype: float64
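A possible alternative that stays with resample is its label parameter. With label='right', each bin still covers the hour starting at its left edge but is labeled with its right edge. The sketch below rebuilds the question's data so it can be run on its own. The construction with date_range is an assumption about how the frame was made, and '60min' stands in for '1H' only because the spelling of the hourly alias differs between pandas versions.

```python
import pandas as pd

# Rebuild the sample data from the question (13 readings, 10 minutes apart, UTC).
ts = pd.date_range("2021-01-01 09:00", periods=13, freq="10min", tz="UTC")
values = [0.00, 0.01, 0.03, 0.07, 0.09, 0.14, 0.12,
          0.14, 0.18, 0.16, 0.14, 0.21, 0.16]
df = pd.DataFrame({"Values": values}, index=ts)
df.index.name = "ts"

# Bins are [09:00, 10:00), [10:00, 11:00), ... (left-closed by default);
# label='right' names each bin after its end instead of its start.
hourly = df["Values"].resample("60min", label="right").mean().round(2)
print(hourly)
```

On this sample the rounded means come out as 0.06, 0.16 and 0.16 at 10:00, 11:00 and 12:00, which matches the expected output in the question.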
Upvotes: 1