Reputation: 1216
I need to set the condition column to True from the time the price first reaches 20 or higher on each day, and for every later row of that day.
I want to avoid apply because I have several million rows, and I think apply would take too long.
Upvotes: 1
Views: 46
Reputation: 862661
Use GroupBy.cummax or GroupBy.cumsum per day and compare for greater or equal with Series.ge:
import pandas as pd
df['datetime'] = pd.to_datetime(df['datetime'])
# Running max per calendar day; True once the price has reached 20 that day
df['condition'] = df.groupby([df['datetime'].dt.date])['price'].cummax().ge(20)
If you also need to test per compid:
df['condition'] = df.groupby(['compid', df['datetime'].dt.date])['price'].cummax().ge(20)
print(df)
   compid            datetime  price  condition
0       1 2020-11-06 00:00:00     10      False
1       1 2020-11-06 00:00:10     20       True
2       1 2020-11-06 00:00:20      5       True
3       1 2020-11-07 00:00:00     20       True
4       1 2020-11-07 00:00:10      5       True
5       1 2020-11-07 00:00:20     25       True
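To reproduce this end to end, here is a minimal self-contained sketch. The sample frame is rebuilt from the printed output above; the helper name flag_after_threshold and its threshold parameter are made up for illustration and are not part of pandas.

```python
import pandas as pd

# Sample data rebuilt from the printed output above.
df = pd.DataFrame({
    "compid": [1, 1, 1, 1, 1, 1],
    "datetime": [
        "2020-11-06 00:00:00", "2020-11-06 00:00:10", "2020-11-06 00:00:20",
        "2020-11-07 00:00:00", "2020-11-07 00:00:10", "2020-11-07 00:00:20",
    ],
    "price": [10, 20, 5, 20, 5, 25],
})
df["datetime"] = pd.to_datetime(df["datetime"])


def flag_after_threshold(frame, threshold=20):
    """Hypothetical helper: True from the first row of each (compid, day)
    group whose price reaches `threshold`, and for every later row of it."""
    # Rows must be in time order within each group for "after" to make sense.
    frame = frame.sort_values(["compid", "datetime"])
    day = frame["datetime"].dt.date
    # The running maximum never decreases, so once it reaches the threshold
    # it stays at or above it for the rest of the group.
    running_max = frame.groupby(["compid", day])["price"].cummax()
    return running_max.ge(threshold)


# Assignment aligns on the index, so the sort inside the helper is harmless.
df["condition"] = flag_after_threshold(df)
print(df)
```

On several million rows, df["datetime"].dt.normalize() is likely a cheaper grouping key than .dt.date, since it stays in datetime64 instead of building Python date objects; it is worth timing on your own data.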
Upvotes: 1
Reputation: 34056
You can use np.where with GroupBy.cumsum:
In [1306]: import numpy as np
In [1307]: df['condition'] = np.where(df.groupby(df.datetime.dt.date).price.cumsum().ge(20), 'TRUE', 'FALSE')
In [1308]: df
Out[1308]:
   compid            datetime  price condition
0       1 2020-11-06 00:00:00     10     FALSE
1       1 2020-11-06 00:00:10     20      TRUE
2       1 2020-11-06 00:00:20      5      TRUE
3       1 2020-11-07 00:00:00     20      TRUE
4       1 2020-11-07 00:00:10      5      TRUE
5       1 2020-11-07 00:00:20     25      TRUE
Or, if you need bool values in the condition column, do this:
In [1309]: df['condition'] = np.where(df.groupby(df.datetime.dt.date).price.cumsum().ge(20), True, False)
In [1310]: df
Out[1310]:
   compid            datetime  price  condition
0       1 2020-11-06 00:00:00     10      False
1       1 2020-11-06 00:00:10     20       True
2       1 2020-11-06 00:00:20      5       True
3       1 2020-11-07 00:00:00     20       True
4       1 2020-11-07 00:00:10      5       True
5       1 2020-11-07 00:00:20     25       True
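One caveat worth checking against your own data: a cumulative sum of prices can cross 20 even when no single price does, so cumsum and cummax only agree on samples like the one above. The sketch below uses made-up prices (not from the question) to show the difference. It also shows that .ge(20) already returns booleans, so the np.where(..., True, False) wrapper is optional.

```python
import pandas as pd

# Made-up prices for a single day; no individual price reaches 20.
demo = pd.DataFrame({
    "datetime": pd.to_datetime([
        "2020-11-08 00:00:00", "2020-11-08 00:00:10", "2020-11-08 00:00:20",
    ]),
    "price": [10, 15, 5],
})
day = demo["datetime"].dt.date

# Running total of prices: 10, 25, 30, so it crosses 20 on the second row.
by_cumsum = demo.groupby(day)["price"].cumsum().ge(20)
# Running maximum of prices: 10, 15, 15, so it never reaches 20.
by_cummax = demo.groupby(day)["price"].cummax().ge(20)

print(by_cumsum.tolist())  # [False, True, True]
print(by_cummax.tolist())  # [False, False, False]
print(by_cummax.dtype)     # bool
```

If the condition is about an individual price reaching 20, cummax is the safer of the two.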
Upvotes: 1