Pandas: Subtracting Two Mismatched Dataframes

Question

Forgive me if this is a repeat question, but I can't find the answer and I'm not even sure what the right terminology is.

I have two dataframes that don't have completely matching rows or columns. Something like:

Balances = pd.DataFrame({'Name':['Alan','Barry','Carl', 'Debbie', 'Elaine'],
              'Age Of Debt':[1,4,3,7,2],
              'Balance':[500,5000,300,100,3000],
              'Payment Due Date':[1,1,30,14,1]})

Payments = pd.DataFrame({'Name':['Debbie','Alan','Carl'],
              'Balance':[50,100,30]})

I want to subtract the Payments dataframe from the Balances dataframe based on Name, so essentially a new dataframe that looks like this:

pd.DataFrame({'Name':['Alan','Barry','Carl', 'Debbie', 'Elaine'],
              'Age Of Debt':[1,4,3,7,2],
              'Balance':[400,5000,270,50,3000],
              'Payment Due Date':[1,1,30,14,1]})

I can imagine having to iterate over the rows of Balances, but when both dataframes are very large I don't think it's very efficient.

Andrej Kesely · Accepted Answer

You can use .merge:

tmp = pd.merge(Balances, Payments, on="Name", how="outer").fillna(0)
Balances["Balance"] = tmp["Balance_x"] - tmp["Balance_y"]
print(Balances)

Prints:

     Name  Age Of Debt  Balance  Payment Due Date
0    Alan            1    400.0                 1
1   Barry            4   5000.0                 1
2    Carl            3    270.0                30
3  Debbie            7     50.0                14
4  Elaine            2   3000.0                 1

Pandas: Subtracting Two Mismatched Dataframes

Answers (1)

Related Questions