user9292

Reputation: 1145

If-else statement with group_by in Pandas dataframe

I have a pandas DataFrame with four columns: ID, t, x1 and x2.

import pandas as pd
dat = {'ID': [1,1,1,1,2,2,2,3,3,3,3,4,4,4,5,5,6,6,6],
       't': [0,1,2,3,0,1,2,0,1,2,3,0,1,2,0,1,0,1,2],
       'x1': [3.5,3.5,3.5,3.5,2.01,2.01,2.01,3.9,3.9,3.9,3.9,2.2,2.2,2.2,1.8,1.8,2.1,2.1,2.1],
       'x2': [4,4,4,4,3,3,3,4,4,4,4,3,3,3,2,2,3,3,3]}

df = pd.DataFrame(dat, columns=['ID', 't', 'x1', 'x2'])

print(df)

I need to create a new column y, grouping by ID, such that (with max(t) taken within each ID):

if t!=max(t) then y=1,
if t==max(t) then y = x1-x2+1.

The expected output was attached as an image, which is not reproduced here.

Please note that I have millions of records, so the faster the solution, the better.

Upvotes: 0

Views: 221

Answers (1)

BENY

Reputation: 323316

We can combine groupby().transform('max') with np.where:

import numpy as np

df['y'] = np.where(df.t != df.groupby('ID').t.transform('max'), 1, df.x1-df.x2+1)
df
Out[221]: 
    ID  t    x1  x2     y
0    1  0  3.50   4  1.00
1    1  1  3.50   4  1.00
2    1  2  3.50   4  1.00
3    1  3  3.50   4  0.50
4    2  0  2.01   3  1.00
5    2  1  2.01   3  1.00
6    2  2  2.01   3  0.01
7    3  0  3.90   4  1.00
8    3  1  3.90   4  1.00
9    3  2  3.90   4  1.00
10   3  3  3.90   4  0.90
11   4  0  2.20   3  1.00
12   4  1  2.20   3  1.00
13   4  2  2.20   3  0.20
14   5  0  1.80   2  1.00
15   5  1  1.80   2  0.80
16   6  0  2.10   3  1.00
17   6  1  2.10   3  1.00
18   6  2  2.10   3  0.10
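
To make the one-liner easier to follow, here is a minimal sketch that splits it into its intermediate steps on a small subset of the question's data (IDs 1 and 5 only). The names t_max and is_last are illustrative and do not appear in the original answer; the logic is meant to be equivalent, with the condition flipped to t == max(t).

```python
import numpy as np
import pandas as pd

# Small subset of the question's data (IDs 1 and 5 only).
df = pd.DataFrame({
    'ID': [1, 1, 1, 1, 5, 5],
    't':  [0, 1, 2, 3, 0, 1],
    'x1': [3.5, 3.5, 3.5, 3.5, 1.8, 1.8],
    'x2': [4, 4, 4, 4, 2, 2],
})

# Step 1: per-ID maximum of t, broadcast back to every row of the group.
# transform returns a Series with the same index as df, so no merge is needed.
t_max = df.groupby('ID')['t'].transform('max')

# Step 2: boolean mask that is True on the row(s) where t equals the group max.
is_last = df['t'].eq(t_max)

# Step 3: vectorized if-else: x1 - x2 + 1 where the mask is True, 1 elsewhere.
df['y'] = np.where(is_last, df['x1'] - df['x2'] + 1, 1)

print(df)
```

Because every step operates on whole columns rather than looping over groups in Python, this should scale reasonably to millions of rows, though actual timings depend on the data and the number of groups.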

Upvotes: 3
