aabujamra
aabujamra

Reputation: 4636

Python / Pandas - Merge on index when I have two indexes

I have a dataframe with double index, it looks like this:

bal:

                    ano             unit period
business_id id                                 
9564        302    2012            reais  anual
            303    2011            reais  anual
2361        304    2013            reais  anual
            305    2012            reais  anual
2369        306    2013            reais  anual
            307    2012            reais  anual

I have another dataframe that looks like this:

accounts:

                           A                     B
id                                                                      

302               5964168.52          1.097601e+07
303               5774707.15          1.086787e+07
304               3652575.31          6.608469e+06 
305                321076.15          6.027066e+06
306               3858137.49          9.733126e+06

I want to merge them so they look like this:

                    ano             unit period              A                     B
business_id id                                 
9564        302    2012            reais  anual     5964168.52          1.097601e+07
            303    2011            reais  anual     5774707.15          1.086787e+07
2361        304    2013            reais  anual     3652575.31          6.608469e+06
            305    2012            reais  anual      321076.15          6.027066e+06
2369        306    2013            reais  anual     3858137.49          9.733126e+06 

What I'm trying to do is something like this:

bal=bal.merge(accounts,left_on='id', right_index=True)

However I think that the synthax is not correct, since I'm getting a ValueError:

ValueError: len(right_on) must equal the number of levels in the index of "left"

Can anyone help?

Upvotes: 0

Views: 511

Answers (2)

PIG
PIG

Reputation: 602

inspired by ununtbu. adding merge

bal.reset_index(['business_id','id']).merge(accounts, left_on = 'id', right_index= True).set_index(['id','business_id'])

Upvotes: 0

unutbu
unutbu

Reputation: 879631

Currently, it is not possible to join on specific levels of a MultiIndex. You can only join on the entire index or by columns.

So you'll have to take the business_id out of the MultiIndex before you join:

result = (bal.reset_index('business_id').join(accounts, how='inner')
          .set_index(['business_id'], append=True))

import pandas as pd

bal = pd.DataFrame({'ano': [2012, 2011, 2013, 2012, 2013, 2012], 'business_id': [9564, 9564, 2361, 2361, 2369, 2369], 'id': [302, 303, 304, 305, 306, 307], 'period': ['anual', 'anual', 'anual', 'anual', 'anual', 'anual'], 'unit': ['reais', 'reais', 'reais', 'reais', 'reais', 'reais']}) 
bal = bal.set_index(['business_id', 'id'])

accounts = pd.DataFrame({'A': [5964168.52, 5774707.15, 3652575.31, 321076.15, 3858137.49], 'B': [10976010.0, 10867870.0, 6608469.0, 6027066.0, 9733126.0], 'id': [302, 303, 304, 305, 306]}) 
accounts = accounts.set_index('id')

result = (bal.reset_index('business_id').join(accounts, how='inner')
          .set_index(['business_id'], append=True))

print(result)

yields

                  ano period   unit           A           B
id  business_id                                            
302 9564         2012  anual  reais  5964168.52  10976010.0
303 9564         2011  anual  reais  5774707.15  10867870.0
304 2361         2013  anual  reais  3652575.31   6608469.0
305 2361         2012  anual  reais   321076.15   6027066.0
306 2369         2013  anual  reais  3858137.49   9733126.0

Upvotes: 1

Related Questions