pandas comparing column value with 0

Question

I have the following dataframe:

import pandas as pd

data = {'sc':['a','a','a','a','b','b','b','b'],
    't1':['O','O','O','X','O','X','O','O'],
    'q1':[10,15,12,12,14,15,16,9],
    's1':[280,310,292,245,267,288,291,298],
    's2':[290,315,294,247,268,285,290,296],
    }
df=pd.DataFrame(data)
df

   sc   t1  q1  s1  s2
0   a   O   10  280 290
1   a   O   15  310 315
2   a   O   12  292 294
3   a   X   12  245 247
4   b   O   14  267 268
5   b   X   15  288 285
6   b   O   16  291 290
7   b   O   9   298 296

I want to create a new column "s3" based on these conditions:

df['s3'] = max(s1 - s2, 0) where t1 == "O" and
df['s3'] = max(q1, 14) where t1 == "X"

Can you please help?

willeM_ Van Onsem · Accepted Answer

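We can make use of np.where [numpy-doc] here. Taking the maximum of a column and a constant is the same as clipping that column from below, so max(s1 - s2, 0) becomes .clip(lower=0) and max(q1, 14) becomes .clip(lower=14):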

import numpy as np

df['s3'] = np.where(
    df['t1'] == 'O',
    df['s1'].sub(df['s2']).clip(lower=0),
    df['q1'].clip(lower=14)
)

This then yields:

>>> df
  sc t1  q1   s1   s2  s3
0  a  O  10  280  290   0
1  a  O  15  310  315   0
2  a  O  12  292  294   0
3  a  X  12  245  247  14
4  b  O  14  267  268   0
5  b  X  15  288  285  15
6  b  O  16  291  290   1
7  b  O   9  298  296   2
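
As a sanity check, the vectorized result can be compared against a literal row-by-row reading of the rule. This is only a minimal sketch: it rebuilds the relevant columns of the question's dataframe, and the names vectorized and row_by_row are made up for illustration. Like np.where, the row-wise version treats every row that is not "O" as "X".

```python
import numpy as np
import pandas as pd

# Relevant columns of the dataframe from the question.
df = pd.DataFrame({
    't1': ['O', 'O', 'O', 'X', 'O', 'X', 'O', 'O'],
    'q1': [10, 15, 12, 12, 14, 15, 16, 9],
    's1': [280, 310, 292, 245, 267, 288, 291, 298],
    's2': [290, 315, 294, 247, 268, 285, 290, 296],
})

# Vectorized version, as in the answer above.
vectorized = np.where(
    df['t1'] == 'O',
    df['s1'].sub(df['s2']).clip(lower=0),
    df['q1'].clip(lower=14),
)

# Literal reading of the rule with Python's built-in max, one row at a time.
row_by_row = [
    max(s1 - s2, 0) if t1 == 'O' else max(q1, 14)
    for t1, q1, s1, s2 in zip(df['t1'], df['q1'], df['s1'], df['s2'])
]

print(vectorized.tolist() == row_by_row)  # True
```

The row-by-row form is easier to read but loops in Python, so the np.where version is usually the better choice for large frames.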

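If s3 already exists and rows where t1 is neither 'O' nor 'X' should keep their current value, we can use np.select instead, passing the existing column as the default: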

df['s3'] = np.select(
    [df['t1'] == 'O', df['t1'] == 'X'],
    [
       df['s1'].sub(df['s2']).clip(lower=0),
       df['q1'].clip(lower=14)
    ],
    default=df['s3']
)
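
To see what the default argument does, here is a small sketch on a made-up three-row frame. The 'Z' category and the placeholder value -1 are assumptions for illustration only and do not appear in the question's data.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: one 'O' row, one 'X' row, and one row with another label.
df = pd.DataFrame({
    't1': ['O', 'X', 'Z'],
    'q1': [10, 12, 11],
    's1': [291, 245, 300],
    's2': [290, 247, 250],
    's3': [-1, -1, -1],   # pre-existing values
})

df['s3'] = np.select(
    [df['t1'] == 'O', df['t1'] == 'X'],
    [
        df['s1'].sub(df['s2']).clip(lower=0),
        df['q1'].clip(lower=14),
    ],
    default=df['s3'],     # rows matching no condition keep their old value
)

print(df['s3'].tolist())  # [1, 14, -1]
```

The 'Z' row is left at -1, whereas np.where would have sent it down the "X" branch.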
