Reputation: 1161
I'm trying to join two dataframes - one with multiindex columns and the other with a single column name. They have similar index.
I get the following warning:
"UserWarning: merging between different levels can give an unintended result (3 levels on the left, 1 on the right)"
For example:
import pandas as pd
import numpy as np
arrays = [['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=['first', 'second'])
np.random.seed(2022) # so the data is the same each time
df = pd.DataFrame(np.random.randn(3, 8), index=['A', 'B', 'C'], columns=index)
df2 = pd.DataFrame(np.random.randn(3), index=['A', 'B', 'C'],columns=['w'])
df3 = df.join(df2)
df
first bar baz foo qux
second one two one two one two one two
A -0.000528 -0.274901 -0.139286 1.984686 0.282109 0.760809 0.300982 0.540297
B 0.373497 0.377813 -0.090213 -2.305943 1.142760 -1.535654 -0.863752 1.016545
C 1.033964 -0.824492 0.018905 -0.383344 -0.304185 0.997292 -0.127274 -1.475886
df2
w
A -1.940906
B 0.833649
C -0.567218
df3
- Result (bar, one) (bar, two) (baz, one) (baz, two) (foo, one) (foo, two) (qux, one) (qux, two) w
A -0.000528 -0.274901 -0.139286 1.984686 0.282109 0.760809 0.300982 0.540297 -1.940906
B 0.373497 0.377813 -0.090213 -2.305943 1.142760 -1.535654 -0.863752 1.016545 0.833649
C 1.033964 -0.824492 0.018905 -0.383344 -0.304185 0.997292 -0.127274 -1.475886 -0.567218
df.join(df2)
from pandas v1.3.0
results in a FutureWarning
FutureWarning: merging between different levels is deprecated and will be removed in a future version. (2 levels on the left, 1 on the right) df3 = df.join(df2)
.What is the best way to join these two dataframes?
Upvotes: 20
Views: 22336
Reputation: 863226
I think the simplest way is to convert df2
to MultiIndex
, and then use concat
or join
:
df2.columns = pd.MultiIndex.from_tuples([('a','w')])
print (df2)
a
w
A -1.940906
B 0.833649
C -0.567218
Or:
df2.columns = [['a'], df2.columns]
print (df2)
a
w
A -1.940906
B 0.833649
C -0.567218
df3 = pd.concat([df, df2], axis=1)
Or:
df3 = df.join(df2)
Result:
print (df3)
first bar baz foo qux a
second one two one two one two one two w
A -0.000528 -0.274901 -0.139286 1.984686 0.282109 0.760809 0.300982 0.540297 -1.940906
B 0.373497 0.377813 -0.090213 -2.305943 1.142760 -1.535654 -0.863752 1.016545 0.833649
C 1.033964 -0.824492 0.018905 -0.383344 -0.304185 0.997292 -0.127274 -1.475886 -0.567218
Upvotes: 3
Reputation: 294488
It depends on what you want! Do you want the column from df2
to be aligned with the 1st or second level of columns from df
?
You have to add a level to the columns of df2
Super cheezy with pd.concat
df.join(pd.concat([df2], axis=1, keys=['a']))
Better way
df2.columns = pd.MultiIndex.from_product([['a'], df2.columns])
df.join(df2)
Upvotes: 14