A1122

Reputation: 1354

groupby column if value is less than some value

I have a dataframe like

df = pd.DataFrame({'time': [1, 5, 100, 250, 253, 260, 700], 'qty': [3, 6, 2, 5, 64, 2, 5]})
df['time_delta'] = df.time.diff()

and I would like to group by time_delta so that all consecutive rows whose time_delta is less than 10 are grouped together, with qty summed. The time_delta column can be dropped. The expected result is

pd.DataFrame({'time': [1, 100, 250, 700], 'qty': [9, 2, 71, 5]})

Basically, I am hoping there is something like df.groupby(time_delta_func(10)).agg({'time': 'min', 'qty': 'sum'}). I read up on pd.Grouper, but its time-based grouping seems to be strictly based on fixed intervals.

Upvotes: 1

Views: 361

Answers (1)

Ben.T

Reputation: 29635

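You can do this with gt (greater than) and cumsum: gt(10) flags each row whose time_delta is greater than 10, and the cumulative sum of those flags increases by one at every such row, so each run of closely spaced rows gets its own group label.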

res = (
    df.groupby(df['time_delta'].gt(10).cumsum(), as_index=False)
      .agg({'time': 'first', 'qty': 'sum'})
)
print(res)
   time  qty
0     1    9
1   100    2
2   250   71
3   700    5
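
To see why this works, it may help to build the group labels step by step. The following is a minimal sketch using the data from the question; the names starts_new_group and group_id are only illustrative. It uses 'min' for time as in the question (equivalent to 'first' here because time is sorted) and resets the index instead of passing as_index=False. If a gap of exactly 10 should also start a new group, as "less than 10" in the question suggests, ge(10) could be used instead of gt(10).

```python
import pandas as pd

df = pd.DataFrame({'time': [1, 5, 100, 250, 253, 260, 700],
                   'qty': [3, 6, 2, 5, 64, 2, 5]})

# Gap to the previous row; the first row has no predecessor, so diff() gives NaN.
time_delta = df['time'].diff()

# True where the gap exceeds 10, i.e. where a new group starts.
# NaN compares as False, so the first row stays in group 0.
starts_new_group = time_delta.gt(10)

# Running count of group starts gives one integer label per row.
group_id = starts_new_group.cumsum().rename('group_id')
print(group_id.tolist())  # [0, 0, 1, 2, 2, 2, 3]

res = (
    df.groupby(group_id)
      .agg({'time': 'min', 'qty': 'sum'})
      .reset_index(drop=True)  # drop the helper labels from the result
)
print(res)
```

Because the labels are computed on the fly, the time_delta column never has to be stored in df.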

Upvotes: 2
