Python Pandas: conditional subtraction

Question

I want to perform conditional subtraction on the dataframe (shown as the first picture).

Basically, this is what I want to do:

Subtract the values of col1 and col2 of food and clothing between me and you and create new rows for the differences.

Since the first row has 'food' and 'me' and the third row has 'food' and 'you', you subtract the values of col1 and col2 of the third row from the first row (300 - 600 = -300, and 200 - 500 = -300).

Since the second row has 'clothing' and 'me' and the fourth row has 'clothing' and 'you', you subtract the values of col1 and col2 of the fourth row from the second row (500 - 200 = 300 and 600 - 700 = -100).

How do I implement this using Pandas dataframe?

Scott Boston · Accepted Answer

You could do it this way using pd.concat, groupby and taking advantage of Pandas intrinsic alignment of data based on indexes:

Input df:

df = pd.DataFrame({'type1':['food','clothing','food','clothing'],'type2':['me','me','you','you'],'col1':[300,500,600,200],'col2':[200,600,500,700]})


pd.concat([df.set_index(['type1','type2'])
  .groupby('type1')
  .apply(lambda x: x.iloc[0]-x.iloc[1])
  .assign(type2='us')
  .set_index('type2', append=True),
  df.set_index(['type1','type2'])]).reset_index()

For Pandas older that 0.20.0

pd.concat([df.set_index(['type1','type2'])
  .groupby(level=0)
  .apply(lambda x: x.iloc[0]-x.iloc[1])
  .assign(type2='us')
  .set_index('type2', append=True),
  df.set_index(['type1','type2'])]).sort_index(level=[1,0]).reset_index()

Output:

      type1 type2  col1  col2
0  clothing    us   300  -100
1      food    us  -300  -300
2      food    me   300   200
3  clothing    me   500   600
4      food   you   600   500
5  clothing   you   200   700

Python Pandas: conditional subtraction

Answers (2)

Related Questions