lmart999

Reputation: 7051

Pandas join() works, but concat() fails

I want to perform an outer join of two dataframes that share the same row index, using pandas 0.14.1.

df1 has shape (456, 1) and df2 has shape (139, 5).

Most of the keys in df2 are found in df1:

[in] print len(list(set(df2.index)-set(df1.index))) 
[out] 16

join works:

[in] df3=df1.join(df2,how='outer')
[in] df3.shape
[out] (473, 6)

concat fails:

[in] df3=pd.concat([df1,df2],axis=1,join='outer')
[out] ValueError: Shape of passed values is (6, 473), indices imply (6, 472)

What may cause this?

Upvotes: 2

Views: 1123

Answers (1)

unutbu

Reputation: 879341

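You could get this error if one of the indexes has duplicate values. Your numbers hint at this: if both indexes were unique, the outer join would have 456 + 16 = 472 rows, not 473. For instance,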

import numpy as np
import pandas as pd

# df1 has a duplicated index label ('A'); df2's index is unique
df1 = pd.DataFrame(np.random.random((5,1)), index=list('AACDE'),
                   columns=['foo'])
df2 = pd.DataFrame(np.random.random((4,1)), index=list('CDEF'),
                   columns=['bar'])

then

In [50]: df1.join(df2, how='outer')
Out[50]: 
        foo       bar
A  0.846814       NaN
A  0.638571       NaN
C  0.516051  0.573165
D  0.789398  0.095466
E  0.921592  0.970619
F       NaN  0.061434

but

In [51]: pd.concat([df1,df2], axis=1, join='outer')
ValueError: Shape of passed values is (2, 6), indices imply (2, 5)
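
To check whether this is what is happening, you can test each index for repeated labels before concatenating. The following is only a sketch and rests on some assumptions: it uses a reasonably recent pandas (`Index.duplicated` may not be available in 0.14.1), and it keeps the first row for each repeated label, which is an arbitrary choice that may not suit your data.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df1 = pd.DataFrame(rng.random((5, 1)), index=list("AACDE"), columns=["foo"])
df2 = pd.DataFrame(rng.random((4, 1)), index=list("CDEF"), columns=["bar"])

# Check each index for repeated labels.
print(df1.index.is_unique)  # False
print(df2.index.is_unique)  # True

# List the labels that occur more than once in df1.
dupes = df1.index[df1.index.duplicated(keep=False)].unique()
print(list(dupes))  # ['A']

# One possible fix (an assumption about what you want): keep only the
# first row for each repeated label, then concatenate.
df1_unique = df1[~df1.index.duplicated(keep="first")]
df3 = pd.concat([df1_unique, df2], axis=1, join="outer")
print(df3.shape)  # (5, 2)
```

If the duplicated rows carry different data that you need to keep, aggregating them (for example with `groupby(level=0)`) or sticking with `join` may be a better fit than dropping rows.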

Upvotes: 4
