Reputation: 197
I have two data frames that I want to join along columns. The values in A are not unique:
df1 = pd.DataFrame({'A': ['0', '1', '2', '2'], 'B': ['B0', 'B1', 'B2', 'B3'], 'C': ['C0', 'C1', 'C2', 'C3']})
A B C
0 0 B0 C0
1 1 B1 C1
2 2 B2 C2
3 2 B3 C3
df2 = pd.DataFrame({'A': ['0', '2', '3'], 'E': ['E0', 'E1', 'E2']}, index=[0, 2, 3])
A E
0 0 E0
2 2 E1
3 3 E2
A should be my index. What I want is:
A B C E
0 0 B0 C0 E0
1 1 B1 C1 NaN
2 2 B2 C2 E1
3 2 B3 C3 E1
This call, pd.concat([df1, df2], axis=1),
gives me this error:
Reindexing only valid with uniquely valued Index objects
Upvotes: 1
Views: 4715
Reputation: 402263
Maybe you're looking for a left outer merge.
df1.merge(df2, how='left')
A B C E
0 0 B0 C0 E0
1 1 B1 C1 NaN
2 2 B2 C2 E1
3 2 B3 C3 E1
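
To make the join key explicit, the same merge can be written with on='A'. This is a minimal sketch, assuming the df1 and df2 from the question; the name `result` is only for illustration. When `on` is omitted, merge falls back to the columns the two frames share, which here is just A.

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['0', '1', '2', '2'],
                    'B': ['B0', 'B1', 'B2', 'B3'],
                    'C': ['C0', 'C1', 'C2', 'C3']})
df2 = pd.DataFrame({'A': ['0', '2', '3'],
                    'E': ['E0', 'E1', 'E2']}, index=[0, 2, 3])

# Left merge: keep every row of df1 and look up E by the value in column A.
# The row labels of df2 play no part, because the match is on a column.
result = df1.merge(df2, how='left', on='A')
print(result)
```

Because the match is on values of A, both rows of df1 with A == '2' receive the same E, and the df2 row with A == '3' is dropped since it has no partner in df1.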
Upvotes: 4
Reputation: 323226
Using combine_first (on the data before your edit):
df1.combine_first(df2).dropna(subset=['A'],axis=0)
Out[320]:
A B C D E
0 A0 B0 C0 D0 E0
1 A1 B1 C1 NaN NaN
2 A2 B2 C2 D1 E1
2 A3 B3 C3 D1 E1
After your edit, using combine_first:
df1.combine_first(df2.set_index('A'))
Out[338]:
A B C E
0 0 B0 C0 E0
1 1 B1 C1 NaN
2 2 B2 C2 E1
3 2 B3 C3 E2
Or
pd.concat([df1,df2.set_index('A')],axis=1)
Out[339]:
A B C E
0 0 B0 C0 E0
1 1 B1 C1 NaN
2 2 B2 C2 E1
3 2 B3 C3 E2
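
If A really should end up as the index, another option is DataFrame.join, which aligns on index labels and defaults to a left join. This is a swapped-in technique rather than either method above, sketched under the assumption that df1 and df2 are the frames from the question; the names `left`, `right` and `indexed` are hypothetical.

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['0', '1', '2', '2'],
                    'B': ['B0', 'B1', 'B2', 'B3'],
                    'C': ['C0', 'C1', 'C2', 'C3']})
df2 = pd.DataFrame({'A': ['0', '2', '3'],
                    'E': ['E0', 'E1', 'E2']}, index=[0, 2, 3])

# Move A into the index on both sides; the left index then holds '2' twice.
left = df1.set_index('A')
right = df2.set_index('A')

# join matches rows by index label (a left join by default), so each
# left row picks up the E value stored under the same A in `right`.
indexed = left.join(right)
print(indexed)
```

This matches on the values of A rather than on the original row labels, so both rows with A == '2' get E1, which is the result asked for in the question.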
Upvotes: 2