Reputation: 7051
I want to perform an outer join of two dataframes that share the same row index, using Pandas 0.14.1.
The shape of df1 is (456, 1) and the shape of df2 is (139, 5).
Most of the keys in df2 are found in df1:
[in] print len(list(set(df2.index)-set(df1.index)))
[out] 16
join works:
[in] df3=df1.join(df2,how='outer')
[in] df3.shape
[out] 473,6
concat fails:
[in] df3=pd.concat([df1,df2],axis=1,join='outer')
[out] ValueError: Shape of passed values is (6, 473), indices imply (6, 472)
What may cause this?
Upvotes: 2
Views: 1123
Reputation: 879341
You could get this error if one of the indexes has duplicate values. For instance,
import numpy as np
import pandas as pd
# df1 has a duplicated index label ('A' appears twice)
df1 = pd.DataFrame(np.random.random((5,1)), index=list('AACDE'),
                   columns=['foo'])
df2 = pd.DataFrame(np.random.random((4,1)), index=list('CDEF'),
                   columns=['bar'])
then
In [50]: df1.join(df2, how='outer')
Out[50]:
        foo       bar
A  0.846814       NaN
A  0.638571       NaN
C  0.516051  0.573165
D  0.789398  0.095466
E  0.921592  0.970619
F       NaN  0.061434
but
In [51]: pd.concat([df1,df2], axis=1, join='outer')
ValueError: Shape of passed values is (2, 6), indices imply (2, 5)
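To check whether this is what is happening in your data, you could test each index for uniqueness and list the repeated labels. The sketch below reuses the toy frames from above. Dropping the repeated rows with `keep='first'` is only an assumption about what you want (you may prefer to aggregate them instead), and `Index.duplicated` needs a more recent pandas than the 0.14.1 mentioned in the question.

```python
import numpy as np
import pandas as pd

# Same toy frames as above: 'A' appears twice in df1's index.
df1 = pd.DataFrame(np.random.random((5, 1)), index=list('AACDE'),
                   columns=['foo'])
df2 = pd.DataFrame(np.random.random((4, 1)), index=list('CDEF'),
                   columns=['bar'])

# Is each index free of repeated labels?
unique_flags = (df1.index.is_unique, df2.index.is_unique)

# Which labels occur more than once in df1?
dup_labels = df1.index[df1.index.duplicated()].unique().tolist()

# One possible fix (an assumption, not the only option):
# keep the first row for each repeated label, then concat.
df1_unique = df1[~df1.index.duplicated(keep='first')]
df3 = pd.concat([df1_unique, df2], axis=1, join='outer')

print(unique_flags)
print(dup_labels)
print(df3.shape)
```

If every row has to be kept, `join` (as in the question) is the safer choice, since it tolerates the repeated labels.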
Upvotes: 4