hippocampus
hippocampus

Reputation: 347

Change numeric column based on category after group by

I have a df like this below:

dff = pd.DataFrame({'id':[1,1,2,2], 'categ':['A','B','A','B'],'cost':[20,5, 30,10] })
dff
   id  categ cost
0   1   A     20
1   1   B      5
2   2   A     30
3   2   B     10

What i want is to make a new df where I group by id and then the cost of category B takes the 20% of the price of category A, and at the same time category A loses this amount. I would like my desired output to be like this:

   id   category    price
0   1    A       16
1   1    B        9
2   2    A       24
3   2    B       16

I have done this below but it only reduces the price of by 20%. Any idea how to do what i want?

dff['price'] = np.where(dff['category'] ==  'A', dff['price'] * 0.8, dff['price'])

Upvotes: 1

Views: 142

Answers (2)

ALollz
ALollz

Reputation: 59579

You can transform to broadcast the 'A' value to every row in the group and take 20% of it. Then using map you can subtract for 'A' and add for 'B'

s = df['cost'].where(df.categ.eq('A')).groupby(df['id']).transform('first')*0.2
df['cost'] = df['cost'] + s*df['categ'].map({'A': -1, 'B': 1})

   id categ  cost
0   1     A  16.0
1   1     B   9.0
2   2     A  24.0
3   2     B  16.0

Upvotes: 1

BENY
BENY

Reputation: 323366

Do pivot then modify and stack back

s = df.pivot(*df)
s['B'] = s['B'] + s['A'] * 0.2
s['A'] *= 0.8
s = s.stack().reset_index(name='cost')
s
Out[446]: 
   id categ  cost
0   1     A  16.0
1   1     B   9.0
2   2     A  24.0
3   2     B  16.0 

Upvotes: 3

Related Questions