Reputation: 626

Calculate the subtraction of two columns in two dataframes by replacement

I have df1 which contains:

IDs    values
E21    32
DD12   82
K99    222

And df2 that contains:

IDs   values
GU1   87
K99   93
E21   48

What I need is to check if the ID in df2 exists in df1, do the subtraction of the value of df1 - df2 for that ID and update the value in df2.

If the ID of df2 does not exist in df1, the value for that ID in df2 remains the same.

So, the result from the above example (basically df2 will be updated):

IDs    values
GU1    87 #the same not changed since it not exist in df1
K99    129 #exist in df1, do the subtraction 222-93=129
E21    -16 #exist in df1, do the subtraction 32-48=129

Any help, please?

Upvotes: 3

Answers (4)

Harpreet Singh

Reputation: 150

# create new column in df2 with name 'new'
df2['new'] = df2['values']
# loop on the values of 'IDs' column
for i, element in enumerate(df2.IDs):
    # condition to check if an element exists in df1  
    if element in df1.IDs.values:
        df2['new'][i]  = df1['values'][df1.index[df1.IDs == element][0]] - df2['values'][i]
# drop the old values column 
df2.drop('values', axis = 1, inplace= True)
# rename the new values column
df2.rename(columns={'new': 'values'}, inplace= True)

Upvotes: 2

Mykola Zotko

Reputation: 17911

You can use the method update:

df2.update(df1 - df2)

Output:

     values
IDs        
GU1    87.0
K99   129.0
E21   -16.0

Upvotes: 1

piRSquared

Reputation: 294576

IIUC:

d = df1.set_index('IDs')['values']
i = df2.itertuples(index=False)
df2.assign(values=[d[x] - v if x in d.index else v for x, v in i])

   IDs  values
0  GU1      87
1  K99     129
2  E21     -16

Same exact idea but using dict instead of pandas.Series

d = dict(zip(df1['IDs'], df1['values']))
i = df2.itertuples(index=False)
df2.assign(values=[d[x] - v if x in d else v for x, v in i])

Upvotes: 1

YOLO

Reputation: 21759

Here's a way using pd.merge:

# merge the data frames
dfx = pd.merge(df2, df1, on='IDs', how='left', suffixes=('','_2'))

# modify new columns
dfx['val'] = dfx['values_2'] - dfx['values']
dfx['val'] = dfx['val'].combine_first(dfx['values'])
dfx = dfx[['IDs','val']].rename(columns={'val':'values'})

print(dfx)

   IDs  values
0  GU1    87.0
1  K99   129.0
2  E21   -16.0

Upvotes: 1

Calculate the subtraction of two columns in two dataframes by replacement

Answers (4)

Related Questions