excelpanda
excelpanda

Reputation: 73

How to concatenate with MultiIndex in Pandas

I have 2 dataframes like this:

df1

ID      Value1      Amount2
1        100         10
2        400         20
3        300         50
df2

ID      Value1     Amount2
2        200         20
3        300         30

I want to get a table like this form these two dfs.

Desired Output:

ID      Value        Amount       Difference_Value         Difference_Amount
      df1    df2    df1   df2        
1     100     0     10     0            100                      10
2     400    200    20    20            200                       0
3     300    300    50    30             0                       20

I am a bit new to Multilevel index. I know this is possible but didn't find other questions helpful for my need.

I need this Value, Amount, Difference_Value and Difference_amount columns to be merged cells in excel so that I need to know this.

Thank you.

Upvotes: 2

Views: 803

Answers (2)

Ch3steR
Ch3steR

Reputation: 20669

You can try using df.merge then split in column using pd.index.str.split

Use df.assign with pd.Series.sub for assigning difference values.

d = df.merge(df1,how='outer',on='ID',suffixes=('-df1','-df2')
).fillna(0)
d
   ID  Value1-df1  Amount2-df1  Value1-df2  Amount2-df2
0   1         100           10         0.0          0.0
1   2         400           20       200.0         20.0
2   3         300           50       300.0         30.0
d = d.assign(diff_value = d['Value1-df1'].sub(d['Value1-df2']),
             diff_amount = d['Amount2-df1'].sub(d['Amount2-df2'])).set_index('ID')
d
    Value1-df1  Amount2-df1  Value1-df2  Amount2-df2  diff_value  diff_amount
ID
1          100           10         0.0          0.0       100.0         10.0
2          400           20       200.0         20.0       200.0          0.0
3          300           50       300.0         30.0         0.0         20.0

Now, split the column at '-' with expand=True for getting MultiIndex then use df.sort_index.

d.columns = d.columns.str.split('-',expand=True) #expand= True makes it MultiIndex
d.sort_index(axis=1)

   Amount2       Value1        diff_amount diff_value
       df1   df2    df1    df2         NaN        NaN
ID
1       10   0.0    100    0.0        10.0      100.0
2       20  20.0    400  200.0         0.0      200.0
3       50  30.0    300  300.0        20.0        0.0

Upvotes: 1

jezrael
jezrael

Reputation: 862521

It is possible if MultiIndex for all columns:

First convert ID to index by DataFrame.set_index, subtract by DataFrame.sub and join together by concat, last for change MultiIndex is used DataFrame.swaplevel and DataFrame.sort_index:

df1 = df1.set_index('ID')
df2 = df2.set_index('ID')

df3 = df1.sub(df2, fill_value=0)

df = (pd.concat([df1, df2, df3], axis=1, keys=(['df1','df2', 'diff']))
        .swaplevel(1,0, axis=1)
        .fillna(0)
        .sort_index(axis=1))
print (df)
   Amount2             Value1              
       df1   df2  diff    df1    df2   diff
ID                                         
1       10   0.0  10.0    100    0.0  100.0
2       20  20.0   0.0    400  200.0  200.0
3       50  30.0  20.0    300  300.0    0.0

If try join together MultiIndex and no MultiIndex Dataframes, get tuples instead MultiIndex:

df1 = df1.set_index('ID')
df2 = df2.set_index('ID')

df3 = df1.sub(df2, fill_value=0)

df = (pd.concat([df1, df2, df3], axis=1, keys=(['df1','df2']))
        .swaplevel(1,0, axis=1)
        .fillna(0)
        .sort_index(axis=1)
        .join(df3.add_prefix('Diff_')))
print (df)
    (Amount2, df1)  (Amount2, df2)  (Value1, df1)  (Value1, df2)  Diff_Value1  \
ID                                                                              
1               10             0.0            100            0.0        100.0   
2               20            20.0            400          200.0        200.0   
3               50            30.0            300          300.0          0.0   

    Diff_Amount2  
ID                
1           10.0  
2            0.0  
3           20.0 

Upvotes: 2

Related Questions