Reputation: 1354
I have a dataframe like this:
import pandas as pd
df = pd.DataFrame({'time': [1, 5, 100, 250, 253, 260, 700], 'qty': [3, 6, 2, 5, 64, 2, 5]})
df['time_delta'] = df.time.diff()
I would like to group by time_delta so that all consecutive rows whose time_delta is less than 10 end up in the same group. The time_delta column can be dropped, and qty should be summed.
The expected result is
pd.DataFrame({'time': [1, 100, 250, 700], 'qty': [9, 2, 71, 5]})
Basically, I am hoping there is something like df.groupby(time_delta_func(10)).agg({'time': 'min', 'qty': 'sum'}). I read up on pd.Grouper, but its time-based grouping seems to be strictly tied to fixed intervals.
Upvotes: 1
Views: 361
Reputation: 29635
You can do it with gt (greater than) and cumsum, which together start a new group each time time_delta is greater than 10:
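res = (
    # gt(10) flags rows that follow a large gap; cumsum turns the flags into group ids
    df.groupby(df['time_delta'].gt(10).cumsum(), as_index=False)
    # the string 'sum' avoids the warning newer pandas versions raise for the builtin sum
    .agg({'time': 'first', 'qty': 'sum'})
)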
print(res)
   time  qty
0     1    9
1   100    2
2   250   71
3   700    5
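If you need this more than once, the same idea can be wrapped in a small function. This is only a sketch: the name sum_by_time_gap and the max_gap parameter are made up for illustration, and it assumes the frame is already sorted by time. It uses ge rather than gt, so a gap of exactly 10 starts a new group, which matches the "less than 10" wording of the question; with gt(10) such a row would stay in the previous group.

```python
import pandas as pd


def sum_by_time_gap(df, max_gap=10):
    """Hypothetical helper: merge consecutive rows whose time gap is below max_gap."""
    # True where the gap to the previous row reaches max_gap.
    # The first row has a NaN gap, which compares as False.
    starts_new_group = df['time'].diff().ge(max_gap)
    # A running count of group starts gives every row an integer group id.
    group_id = starts_new_group.cumsum()
    return (
        df.groupby(group_id)
        .agg(time=('time', 'min'), qty=('qty', 'sum'))
        .reset_index(drop=True)  # discard the group ids, keep a plain 0..n-1 index
    )


df = pd.DataFrame({'time': [1, 5, 100, 250, 253, 260, 700],
                   'qty': [3, 6, 2, 5, 64, 2, 5]})
res = sum_by_time_gap(df)
print(res)
```

Because the gaps are computed inside the function, the time_delta helper column is not needed at all.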
Upvotes: 2