tandem
tandem

Reputation: 2238

Merge multiple dataframes in pandas

I have two data frames: df1 as follows:

  col0 col1  col1.1  col3
0    a    d       1     6
1    b    e       5     7

And df2 as follows:

  colx  coly
0    a    10
1    b    20
2    d    50
3    e    40

How do I combine the dataframes in-place such that the final df looks like this?

  col0 col1  col1.1  col3  colx coly
0    a    d       1     6   10   50
1    b    e       5     7   20   40

As far as I understand, the merge method in pandas merges both dataframes based on a given axis. In this case, I want to merge them based on the value in another dataframe. What is the function I am looking for in this case?

Upvotes: 2

Views: 135

Answers (5)

lmo
lmo

Reputation: 38500

Here is a method to join on the indexes:

df1.join([df2.iloc[0:2,1].rename('colx'),
          df2.iloc[2:,1].rename('coly').reset_index(drop=True)])

Elements of the desired column are extracted, subset, and renamed. The index of the second series is reset to match. Then these are joined to df1.

this returns

  col0 col1  col1.1  col3  colx  coly
0    a    d       1     6    10    50
1    b    e       5     7    20    40

Upvotes: 0

Quang Hoang
Quang Hoang

Reputation: 150735

Another solution with replace:

dct = df2.set_index('colx')['coly'].to_dict()
df1['coly_x'] = df1['col0'].replace(dct)
df1['coly_y'] = df1['col1'].replace(dct)

Output:

  col0 col1  col1.1  col3 coly_x  coly_y
0    a    d       1     6     10      50
1    b    e       5     7     20      40

Upvotes: 1

Scott Boston
Scott Boston

Reputation: 153460

You could use map:

mapper = df2.set_index('colx')['coly']
df1['colx'] = df1['col0'].map(mapper)
df1['coly'] = df1['col1'].map(mapper)

print(df1)

Output:

  col0 col1  col1.1  col3  colx  coly
0    a    d       1     6    10    50
1    b    e       5     7    20    40

Upvotes: 3

anky
anky

Reputation: 75080

Using applymap and get:

s=df2.set_index('colx')['coly']
df1[['colx','coly']]=df1.iloc[:,:2].applymap(lambda x: s.get(x))
print(df1)

  col0 col1  col1.1  col3  colx  coly
0    a    d       1     6    10    50
1    b    e       5     7    20    40

Upvotes: 2

Erfan
Erfan

Reputation: 42916

Using merge chained twice:

mrg = df1.merge(df2, left_on='col0', right_on='colx')\
         .merge(df2, left_on='col1', right_on='colx')\
         .drop(columns=['colx_x', 'colx_y'])

Output

  col0 col1  col1.1  col3  coly_x  coly_y
0    a    d       1     6      10      50
1    b    e       5     7      20      40

Upvotes: 4

Related Questions