How to merge pandas dataframes based on a column?

Question

What's the pythonic / pandas way of merging multiple pandas dataframes? At the moment I do it with loops, but it does not feel right:

I have three dataframes with credit runtimes; all of them have an interest, liquidation and date field. The credits have different runtimes (i.e. a different number of rows).

Here's a sample of one credit.

        amount      annuity date        int.    int. %  liq.    liq. %  special_payment
0       50,000.00   135.42  2016-09-01  52.08   1.25    83.33   2.00    0
1       49,916.67   135.42  2016-10-01  52.00   1.25    83.42   2.00    0
2       49,833.25   135.42  2016-11-01  51.91   1.25    83.51   2.00    0
3       49,749.74   135.42  2016-12-01  51.82   1.25    83.59   2.00    0
4       49,666.15   135.42  2017-01-01  51.74   1.25    83.68   2.00    0

I want to calculate the total burnrate of all credits.

That is:

[interest + liquidation of credit 1] +  
[interest + liquidation of credit 2] +  
[interest + liquidation of credit 3]

If a credit does not run on a given date, its interest + liquidation should be zero for that date.

I'm new to pandas, hence I hope for some insights on how to approach such a problem.

Peter9192 · Accepted Answer

I think you can use pd.concat, but you need to pass axis=1 so that the dataframes are placed side by side. You also need the keys argument to be able to distinguish columns with the same name from different credits.

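import pandas as pd
import numpy as np
# Two dummy dataframes with identical column names
df1 = pd.DataFrame(np.random.rand(4,4),columns=['a','b','c','d'])
df2 = pd.DataFrame(np.random.rand(4,4),columns=['a','b','c','d'])
# Side-by-side concat; keys adds an outer column level per credit
df3 = pd.concat([df1,df2],axis=1,keys=['Credit1','Credit2'])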

To add values from the same column of different credits, use, e.g.

burnrate = df3['Credit1','a']+df3['Credit2','a']
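
Note that concat with axis=1 aligns rows on the index, so for credits with different runtimes the date should probably be the index first. Here is a minimal sketch of how that could look for the question's data. It assumes the column names 'date', 'int.' and 'liq.' from the sample; the second credit and its values are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data: two credits with different runtimes.
credit1 = pd.DataFrame({
    "date": pd.to_datetime(["2016-09-01", "2016-10-01", "2016-11-01"]),
    "int.": [52.08, 52.00, 51.91],
    "liq.": [83.33, 83.42, 83.51],
})
credit2 = pd.DataFrame({
    "date": pd.to_datetime(["2016-10-01", "2016-11-01", "2016-12-01"]),
    "int.": [10.00, 9.90, 9.80],
    "liq.": [20.00, 20.10, 20.20],
})

# Index each frame by date so concat aligns rows on the date, not on position.
frames = [c.set_index("date")[["int.", "liq."]] for c in (credit1, credit2)]

# Side-by-side concat; dates on which a credit does not run become NaN, then 0.
combined = (
    pd.concat(frames, axis=1, keys=["Credit1", "Credit2"])
    .sort_index()
    .fillna(0)
)

# Sum interest and liquidation over all credits for each date.
burnrate = combined.sum(axis=1)
print(burnrate)
```

Summing over all columns only works here because each frame was reduced to the interest and liquidation columns beforehand; with more columns you would select the ones you need first.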
