Reputation: 383
I have Pandas Dataframe df1 as:
ID | c1 | c2 | c3 ----------------- 1 | A | B | 32 2 | C | D | 34 3 | A | B | 11 4 | E | F | 3
And df2:
ID | c1 | c2 ------------ 1 | A | B 2 | C | D 3 | E | F
There is foreign key between df1 and df2 on columns (c1, c2). Join look like:
pd.merge(df1, df2, left_on=['c1','c2'], right_on = ['c1','c2'])
Result is:
ID_x| c1 | c2 | c3 | ID_y ------------------------- 1 | A | B | 32 | 1 2 | C | D | 34 | 2 3 | A | B | 11 | 1 4 | E | F | 3 | 3
I want to replace (c1,c2) in df1 with df2.id. Expected FINAL df1 is:
ID| c3 | df2_id --------------- 1 | 32 | 1 2 | 34 | 2 3 | 11 | 1 4 | 3 | 3
In other words I want to add column 'df2_id' in df1(filled with df2.id value for this row) and drop columns (c1,c2)(they are not necessary anymore).
I have idea to do that by:
Is there any better solution?
Upvotes: 1
Views: 774
Reputation: 42916
We could make a one liner out of your steps by making use of suffixes
argument and on
instead of left_on, right_on
plus using method chaining with drop
:
df1.merge(df2, on=['c1','c2'], suffixes=['_1', '_2']).drop(['c1', 'c2'], axis=1)
Output
ID_1 c3 ID_2
0 1 32 1
1 3 11 1
2 2 34 2
3 4 3 3
To make it exactly like OP's output:
df1.merge(df2, on=['c1','c2'], suffixes=['', '_2']).drop(['c1', 'c2'], axis=1).rename(columns={"id_2": "df2_id"})
Upvotes: 2