Reputation: 4408
I have a dataset, df, that contains multiple groups. I would like to apply thresholds to each group: if a value is above or below a certain threshold, a certain text should appear.
group start end diff percent date
A 2019-04-01 2019-05-01 -160 -11 04-01-2019 to 05-01-2019
A 2019-05-01 2019-06-01 136 8 05-01-2019 to 06-01-2019
B 2020-06-01 2020-07-01 202 5 06-01-2020 to 07-01-2020
B 2020-07-01 2020-08-01 283 7 07-01-2020 to 08-01-2020
I would like to set an upper threshold of 250 and a lower threshold of 0, so that any value >250 or <0 is flagged.
Desired results:
group start end diff percent date result
A 2019-04-01 2019-05-01 -160 -11 04-01-2019 to 05-01-2019 unacceptable
A 2019-05-01 2019-06-01 136 8 05-01-2019 to 06-01-2019 acceptable
B 2020-06-01 2020-07-01 202 5 06-01-2020 to 07-01-2020 acceptable
B 2020-07-01 2020-08-01 283 7 07-01-2020 to 08-01-2020 unacceptable
This is what I am doing:
df['result'] = df.where(df['percent']> 250,'unacceptable')
This is not working, and I am researching this. Any suggestion is appreciated.
Upvotes: 0
Views: 340
Reputation: 26676
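Let's try binning with pd.cut:
# bin the numeric diff column (df.diff is a DataFrame method, so use df['diff'])
df['result'] = pd.cut(df['diff'], [-np.inf, 0, 250, np.inf], labels=['unacceptablelow', 'acceptable', 'unacceptablehigh'])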
group start end diff percent date result
A 2019-04-01 2019-05-01 -160 -11 04-01-2019 to 05-01-2019 unacceptablelow
A 2019-05-01 2019-06-01 136 8 05-01-2019 to 06-01-2019 acceptable
B 2020-06-01 2020-07-01 202 5 06-01-2020 to 07-01-2020 acceptable
B 2020-07-01 2020-08-01 283 7 07-01-2020 to 08-01-2020 unacceptablehigh
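For anyone who wants to run the binning approach end to end, here is a minimal sketch. It assumes the thresholds apply to the diff column (that is the column the desired output matches) and rebuilds only the columns it needs from the sample data in the question; the variable names `bins` and `labels` are just illustrative.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question (only the columns needed here).
df = pd.DataFrame({
    "group": ["A", "A", "B", "B"],
    "diff": [-160, 136, 202, 283],
    "percent": [-11, 8, 5, 7],
})

# Bin edges produce the intervals (-inf, 0], (0, 250], (250, inf].
bins = [-np.inf, 0, 250, np.inf]
labels = ["unacceptablelow", "acceptable", "unacceptablehigh"]

# pd.cut assigns each diff value the label of the interval it falls in.
df["result"] = pd.cut(df["diff"], bins=bins, labels=labels)

print(df)
```

One caveat: with the default right=True the intervals are closed on the right, so a diff of exactly 0 lands in unacceptablelow rather than acceptable. If the boundaries matter for your data, check which side they should fall on before relying on this.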
Upvotes: 1
Reputation: 15498
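Why not use df.loc instead? Set a default label first, then overwrite the rows outside the thresholds (the desired output matches the diff column, so test that one):
df['result'] = 'acceptable'
df.loc[(df['diff'] > 250) | (df['diff'] < 0), 'result'] = 'unacceptable'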
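A self-contained sketch of the boolean-mask idea, producing the two labels from the desired output. It assumes the thresholds are checked against diff, and the names `LOWER`, `UPPER` and `in_range` are made up for illustration. Note that DataFrame.where keeps values where the condition is True and replaces the rest, and it returns a whole DataFrame, which is likely why the attempt in the question did not do what was intended; np.where picks between two values per row instead.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question (only the columns needed here).
df = pd.DataFrame({
    "group": ["A", "A", "B", "B"],
    "diff": [-160, 136, 202, 283],
    "percent": [-11, 8, 5, 7],
})

LOWER, UPPER = 0, 250  # thresholds stated in the question

# between() is inclusive on both ends by default, so 0 <= diff <= 250 is True.
in_range = df["diff"].between(LOWER, UPPER)

# Pick a label per row: 'acceptable' inside the range, 'unacceptable' outside.
df["result"] = np.where(in_range, "acceptable", "unacceptable")

print(df)
```

Because between() includes both endpoints, only values strictly below 0 or strictly above 250 are marked unacceptable, which lines up with the <0 and >250 conditions in the question.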
Upvotes: 1