gabboshow

Reputation: 5559

Concatenate a multi-index DataFrame with a DataFrame

I am trying to concatenate two DataFrames, df1 and df2. df1 is a multi-index DataFrame, and df2 has fewer rows than df1:

import pandas as pd
import numpy as np
arrays = [['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
          ['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=['first', 'second'])
df1 = pd.DataFrame(np.random.randn(8), index=index)

df1
Out[15]: 
                     0
first second          
bar   one    -0.185560
      two    -2.358254
baz   one     1.130550
      two     1.441708
foo   one    -1.163076
      two     1.776814
qux   one    -0.811836
      two     0.389500

df2 = pd.DataFrame(data=[0,1,0,1],index=['bar','baz','foo', 'qux'],columns=['label'])

df2
Out[18]: 
     label
bar      0
baz      1
foo      0
qux      1

The desired result would be something like:

df3
Out[18]: 
                     0      label
first second          
bar   one    -0.185560          0
      two    -2.358254          0
baz   one     1.130550          1
      two     1.441708          1
foo   one    -1.163076          0
      two     1.776814          0
qux   one    -0.811836          1
      two     0.389500          1

Upvotes: 1

Views: 51

Answers (2)

MaxU - stand with Ukraine

Reputation: 210832

In [132]: df1['label'] = df1.index.get_level_values(0).to_series().map(df2['label']).values

In [133]: df1
Out[133]:
                     0  label
first second
bar   one     0.143211      0
      two     1.133454      0
baz   one     1.298973      1
      two    -0.717844      1
foo   one    -0.663768      0
      two     0.687015      0
qux   one     0.412729      1
      two     0.366502      1

Or, as a better option (thanks to @Dark for the hint):

df1['label'] = df1.index.get_level_values(0).map(df2['label'].get)
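
For reference, here is a minimal self-contained sketch of the same mapping idea. It is not taken from the session above: the seeded generator `rng` and the variable `first_level` are names introduced only for illustration, and the sketch passes the `df2['label']` Series itself to `Index.map` rather than its `.get` method.

```python
import numpy as np
import pandas as pd

# Rebuild the question's frames; the seed is an assumption, used only for reproducibility.
index = pd.MultiIndex.from_product(
    [['bar', 'baz', 'foo', 'qux'], ['one', 'two']],
    names=['first', 'second'],
)
rng = np.random.default_rng(0)
df1 = pd.DataFrame(rng.standard_normal(8), index=index)
df2 = pd.DataFrame({'label': [0, 1, 0, 1]},
                   index=['bar', 'baz', 'foo', 'qux'])

# Take the first-level key of every row, then look each key up in df2['label'].
first_level = df1.index.get_level_values('first')
df1['label'] = first_level.map(df2['label']).to_numpy()

print(df1)
```

Converting the mapped result with `.to_numpy()` assigns the labels by position, so no index alignment between the mapped values and df1 is involved.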

Upvotes: 2

EdChum

Reputation: 393963

Another method is to call reset_index on the second level. You can then add the column, which aligns on the first-level index values, and set the index back again:

In[52]:
df3 = df1.reset_index(level=1)
df3['label'] = df2['label']
df3 = df3.set_index([df3.index, 'second'])
df3

Out[52]: 
                     0  label
first second                 
bar   one     0.957417      0
      two    -0.466755      0
baz   one     1.064326      1
      two     1.036983      1
foo   one    -1.319737      0
      two     0.064465      0
qux   one    -0.237232      1
      two    -0.511889      1
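
The same steps can be written as a standalone script. This is only a sketch under a few assumptions: the frames are rebuilt with a seeded generator (`rng`, an illustrative name), levels are referred to by name instead of position, and `set_index(..., append=True)` is used as a variant of passing `[df3.index, 'second']`.

```python
import numpy as np
import pandas as pd

# Rebuild the question's frames; the seed is an assumption, used only for reproducibility.
index = pd.MultiIndex.from_product(
    [['bar', 'baz', 'foo', 'qux'], ['one', 'two']],
    names=['first', 'second'],
)
rng = np.random.default_rng(0)
df1 = pd.DataFrame(rng.standard_normal(8), index=index)
df2 = pd.DataFrame({'label': [0, 1, 0, 1]},
                   index=['bar', 'baz', 'foo', 'qux'])

# Move 'second' out of the index so only 'first' remains as the row index.
df3 = df1.reset_index(level='second')

# Column assignment aligns df2['label'] on the (repeated) 'first' keys.
df3['label'] = df2['label']

# Put 'second' back as the inner index level.
df3 = df3.set_index('second', append=True)

print(df3)
```

The alignment step relies on df2's index being unique; each repeated first-level key in df3 then picks up the single matching label.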

Upvotes: 2
