Morticia A. Addams
Morticia A. Addams

Reputation: 383

Replace dataframe multiple columns with id from another dataframe

I have Pandas Dataframe df1 as:

ID | c1 | c2 | c3
-----------------
1  | A  | B  | 32
2  | C  | D  | 34
3  | A  | B  | 11
4  | E  | F  | 3

And df2:

ID | c1 | c2
------------
1  | A  | B
2  | C  | D
3  | E  | F

There is foreign key between df1 and df2 on columns (c1, c2). Join look like:

pd.merge(df1, df2, left_on=['c1','c2'], right_on = ['c1','c2'])

Result is:

ID_x| c1 | c2 | c3 | ID_y
-------------------------
1   | A  | B  | 32 | 1
2   | C  | D  | 34 | 2
3   | A  | B  | 11 | 1
4   | E  | F  | 3  | 3

I want to replace (c1,c2) in df1 with df2.id. Expected FINAL df1 is:

ID| c3 | df2_id
---------------
1 | 32 | 1
2 | 34 | 2
3 | 11 | 1
4 | 3  | 3

In other words I want to add column 'df2_id' in df1(filled with df2.id value for this row) and drop columns (c1,c2)(they are not necessary anymore).

I have idea to do that by:

  1. save result from merge in df1
  2. drop unnecessary columns (c1,c2)
  3. rename 'ID_y' to 'df2_id' and 'ID_x' to 'ID'

Is there any better solution?

Upvotes: 1

Views: 774

Answers (1)

Erfan
Erfan

Reputation: 42916

We could make a one liner out of your steps by making use of suffixes argument and on instead of left_on, right_on plus using method chaining with drop:

df1.merge(df2, on=['c1','c2'], suffixes=['_1', '_2']).drop(['c1', 'c2'], axis=1)

Output

   ID_1  c3  ID_2
0     1  32     1
1     3  11     1
2     2  34     2
3     4   3     3

To make it exactly like OP's output:

df1.merge(df2, on=['c1','c2'], suffixes=['', '_2']).drop(['c1', 'c2'], axis=1).rename(columns={"id_2": "df2_id"})

Upvotes: 2

Related Questions