Reputation: 4636
I have a dataframe with double index, it looks like this:
bal:
ano unit period
business_id id
9564 302 2012 reais anual
303 2011 reais anual
2361 304 2013 reais anual
305 2012 reais anual
2369 306 2013 reais anual
307 2012 reais anual
I have another dataframe that looks like this:
accounts:
A B
id
302 5964168.52 1.097601e+07
303 5774707.15 1.086787e+07
304 3652575.31 6.608469e+06
305 321076.15 6.027066e+06
306 3858137.49 9.733126e+06
I want to merge them so they look like this:
ano unit period A B
business_id id
9564 302 2012 reais anual 5964168.52 1.097601e+07
303 2011 reais anual 5774707.15 1.086787e+07
2361 304 2013 reais anual 3652575.31 6.608469e+06
305 2012 reais anual 321076.15 6.027066e+06
2369 306 2013 reais anual 3858137.49 9.733126e+06
What I'm trying to do is something like this:
bal=bal.merge(accounts,left_on='id', right_index=True)
However I think that the synthax is not correct, since I'm getting a ValueError:
ValueError: len(right_on) must equal the number of levels in the index of "left"
Can anyone help?
Upvotes: 0
Views: 511
Reputation: 602
inspired by ununtbu. adding merge
bal.reset_index(['business_id','id']).merge(accounts, left_on = 'id', right_index= True).set_index(['id','business_id'])
Upvotes: 0
Reputation: 879631
Currently, it is not possible to join on specific levels of a MultiIndex. You can only join on the entire index or by columns.
So you'll have to take the business_id
out of the MultiIndex before you join:
result = (bal.reset_index('business_id').join(accounts, how='inner')
.set_index(['business_id'], append=True))
import pandas as pd
bal = pd.DataFrame({'ano': [2012, 2011, 2013, 2012, 2013, 2012], 'business_id': [9564, 9564, 2361, 2361, 2369, 2369], 'id': [302, 303, 304, 305, 306, 307], 'period': ['anual', 'anual', 'anual', 'anual', 'anual', 'anual'], 'unit': ['reais', 'reais', 'reais', 'reais', 'reais', 'reais']})
bal = bal.set_index(['business_id', 'id'])
accounts = pd.DataFrame({'A': [5964168.52, 5774707.15, 3652575.31, 321076.15, 3858137.49], 'B': [10976010.0, 10867870.0, 6608469.0, 6027066.0, 9733126.0], 'id': [302, 303, 304, 305, 306]})
accounts = accounts.set_index('id')
result = (bal.reset_index('business_id').join(accounts, how='inner')
.set_index(['business_id'], append=True))
print(result)
yields
ano period unit A B
id business_id
302 9564 2012 anual reais 5964168.52 10976010.0
303 9564 2011 anual reais 5774707.15 10867870.0
304 2361 2013 anual reais 3652575.31 6608469.0
305 2361 2012 anual reais 321076.15 6027066.0
306 2369 2013 anual reais 3858137.49 9733126.0
Upvotes: 1