Reputation: 535
I am using pandas DataFrame as a lightweight dataset to maintain some status and need to dynamically/continuously merge new DataFrames into existing table. Say I have two datasets as below:
df1:
a b
0 0 1
1 2 3
2 4 5
3 6 7
4 8 9
df2:
b c
0 10 11
1 12 13
2 14 15
3 16 17
4 18 19
I want to merge df2 to df1 (on index), and for columns in common (in this case, it is 'b'), simply discard the common column of df2.
a b c
0 0 1 11
1 2 3 13
2 4 5 15
3 6 7 17
4 8 9 19
My code was checking common part between df1 and df2 by using SET, so that I manually drop common part in df2. I wonder is there any much efficient way to do this?
Upvotes: 3
Views: 7208
Reputation: 192
Pandas merge function will also work wonders. You can do it as:
pd.merge(left=df1, right=df2, how='inner')
a b c
0 0 1 11
1 2 3 13
2 4 5 15
3 6 7 17
4 8 9 19
by eliminating the 'on' attribute of merge function it will consider the columns which are in-common in both of the dataframes.
Upvotes: 6
Reputation: 294328
First identify the columns in df2
not in df1
cols = df2.columns.difference(df1.columns)
Then pd.DataFrame.join
df1.join(df2[cols])
a b c
0 0 1 11
1 2 3 13
2 4 5 15
3 6 7 17
4 8 9 19
Or pd.concat
will also work
pd.concat([df1, df2[cols]], axis=1)
a b c
0 0 1 11
1 2 3 13
2 4 5 15
3 6 7 17
4 8 9 19
Upvotes: 8