SlowlyLearning
SlowlyLearning

Reputation: 135

Subtracting multiple columns between dataframes based on key

I have two dataframes, example:

Df1 -

A   B   C   D
x   j   5   2
y   k   7   3
z   l   9   4
Df2 -

A   B   C   D
z   o   1   1
x   p   2   1
y   q   3   1

I want to deduct columns C and D in Df2 from columns C and D in Df1 based on the key contained in column A.

I also want to ensure that column B remains untouched, example:

Df3 -

A   B   C   D
x   j   3   1
y   k   4   2
z   l   8   3

I found an almost perfect answer in the following thread: Subtracting columns based on key column in pandas dataframe

However what the answer does not explain is if there are other columns in the primary df (such as column B) that should not be involved as an index or with the operation.

Is somebody please able to advise?

I was originally performing a loop which find the value in the other df and deducts it however this takes too long for my code to run with the size of data I am working with.

Upvotes: 1

Views: 372

Answers (1)

jezrael
jezrael

Reputation: 863056

Idea is specify column(s) for maching and column(s) for subtract, convert all not cols columnsnames to MultiIndex, subtract:

match = ['A']
cols = ['C','D']
df1 = Df1.set_index(match + Df1.columns.difference(match + cols).tolist())

df = df1.sub(Df2.set_index(match)[cols], level=0).reset_index()
print (df)
   A  B  C  D
0  x  j  3  1
1  y  k  4  2
2  z  l  8  3

Or replace not matched values to original Df1:

match = ['A']
cols = ['C','D']
df1 = Df1.set_index(match)

df = df1.sub(Df2.set_index(match)[cols], level=0).reset_index().fillna(Df1)
print (df)
   A  B  C  D
0  x  j  3  1
1  y  k  4  2
2  z  l  8  3

Upvotes: 1

Related Questions