Roby
Roby

Reputation: 2061

Pandas Multindex join

I have the following DataFrames:

Index is Date and Id

               Value
Date       Id
2017-01-01 1     1.4
2017-02-01 1     1.4
2017-03-01 1     1.4
2017-01-01 2     1.4
2017-01-02 2     1.4
2017-01-03 2     1.4

and a second DatFrame containing general information for Id (index):

   V1 V2
Id
1   x  y
2   y  x

and I want to create a DataFrame that looks like (with Index Date and Id):

   Date      Id   Value  V1   V2
2017-01-01    1    1.4    x    y
2017-02-01    1    1.4    x    y
2017-03-01    1    1.4    x    y
2017-01-01    2    1.4    y    x
2017-01-02    2    1.4    y    x
2017-01-03    2    1.4    y    x

I've tried it with pd.join(df1, df2, on='Id') but get a key error.

Upvotes: 1

Views: 79

Answers (2)

Nickil Maveli
Nickil Maveli

Reputation: 29711

The on parameter in DF.join() method takes column names (or) a list of column names, but you've supplied the index level name "Id" to it. Hence it throws that KeyError indicating that it cannot find this string while querying for the column names to be used for the very joining purpose.

Instead, do not specify this as join by default infers an alignment over the indices unless specified otherwise using the on parameter.

For your case, a simple join considering the left DF's index would yield the right result:

df1.join(df2).reset_index()

enter image description here


Another alternative wherein you want to specify a particular level name to be used for joining could be achieved by using the DF.reindex() method as shown:

df1.join(df2.reindex(index=df1.index, level='Id')).reset_index()

enter image description here

Upvotes: 1

Henry
Henry

Reputation: 1686

I dont know if my way here is the recommended one, but I have certainly got this working:

df1.reset_index(levels=['Date','Id'])
#This makes your index just columns in the dataframe
df2.reset_index....
df3=pd.merge(df1,df2,on=['Date','Id'])

Upvotes: 0

Related Questions