Reputation: 167
Setup: two pandas dataframes; data from df2 needs to be added to df1, as explained below:
I want to add values from the one column of df2 to all three copies of the rows in df1 where the three corresponding levels match.
Having learned that 'merging with more than one level overlap on a multiIndex is not implemented' in pandas, I propose to map the values, but have not found a way to map on (multiple) index levels, or multiple columns, if reset index levels to columns:
df1 = pd.DataFrame(np.array([['Dec', 'NY', 'Ren', 'Q1', 10],
['Dec', 'NY', 'Ren', 'Q2', 12],
['Dec', 'NY', 'Ren', 'Q3', 14],
['Dec', 'FL', 'Mia', 'Q1', 6],
['Dec', 'FL', 'Mia', 'Q2', 8],
['Dec', 'FL', 'Mia', 'Q3', 17],
['Apr', 'CA', 'SC', 'Q1', 1],
['Apr', 'CA', 'SC', 'Q2', 2],
['Apr', 'CA', 'SC', 'Q3', 3]]), columns=['Date', 'State', 'County', 'Quarter', 'x'])
df1.set_index(['Date', 'State', 'County', 'Quarter'], inplace=True)
df2 = pd.DataFrame(np.array([['Dec', 'NY', 'Ren', 0.4],
['Dec', 'FL', 'Mia', 0.3]]), columns=['Date', 'State', 'County', 'y'])
df2.set_index(['Date', 'State', 'County', 'y'], inplace=True)
df_combined = df1['Date', 'State', 'County'].map(df2)
Upvotes: 1
Views: 2837
Reputation: 2127
You can temporarily change df1
to change the index to do the join:
df_combined = df1.reset_index(3).join(df2,how='left')
>>> df_combined
level_3 x y
Apr CA SC Q1 1 NaN
SC Q2 2 NaN
SC Q3 3 NaN
Dec FL Mia Q1 6 0.3
Mia Q2 8 0.3
Mia Q3 17 0.3
NY Ren Q1 10 0.4
Ren Q2 12 0.4
Ren Q3 14 0.4
df_combined.set_index('level_3',append=True, inplace=True)
df_combined.index.rename(None,3,inplace=True)
>>> df_combined
x y
Apr CA SC Q1 1 NaN
Q2 2 NaN
Q3 3 NaN
Dec FL Mia Q1 6 0.3
Q2 8 0.3
Q3 17 0.3
NY Ren Q1 10 0.4
Q2 12 0.4
Q3 14 0.4
The reset_index method is used to temporarily turn the index that isn't in df2
into a column so that you can do a normal join. Then turn the column back into an index when you're done.
Upvotes: 2