Reputation: 345
I am having following dataframe
data = {'sc':['a','a','a','a','b','b','b','b'],
't1':['O','O','O','X','O','X','O','O'],
'q1':[10,15,12,12,14,15,16,9],
's1':[280,310,292,245,267,288,291,298],
's2':[290,315,294,247,268,285,290,296],
}
df=pd.DataFrame(data)
df
sc t1 q1 s1 s2
0 a O 10 280 290
1 a O 15 310 315
2 a O 12 292 294
3 a X 12 245 247
4 b O 14 267 268
5 b X 15 288 285
6 b O 16 291 290
7 b O 9 298 296
I want to create a new column "s3" based on conditions
data['s3']=max(s1-s2,0) where t1="O" and
data['s3']=max(q1,14) where t1="X"
Can you pl help ?
Upvotes: 2
Views: 833
Reputation: 476719
We can make use of np.where
[numpy-doc] here:
import numpy as np
df['s3'] = np.where(
df['t1'] == 'O',
df['s1'].sub(df['s2']).clip(lower=0),
df['q1'].clip(lower=14)
)
This then yields:
>>> df
sc t1 q1 s1 s2 s3
0 a O 10 280 290 0
1 a O 15 310 315 0
2 a O 12 292 294 0
3 a X 12 245 247 14
4 b O 14 267 268 0
5 b X 15 288 285 15
6 b O 16 291 290 1
7 b O 9 298 296 2
If s3
already exists, we can use np.select
instead:
df['s3'] = np.select(
[df['t1'] == 'O', df['t1'] == 'X'],
[
df['s1'].sub(df['s2']).clip(lower=0),
df['q1'].clip(lower=14)
],
default=df['s3']
)
Upvotes: 3