Antonio López Ruiz
Antonio López Ruiz

Reputation: 1466

Merging tables of different sizes by column Python

I have two tables like this:

Table A:

id id2 value
1   1   a
1   2   b
2   1   c
3   1   d

Table B:

id value2
1    e
2    g
3    h

And I need to join them so i get something like this:

Table needed:

id   id2  value value2
1     1     a     e
1     2     b     e
2     1     c     g
3     1     d     h

Excel or Python or R would work. The fact is i need that if the id in table A matches with the one in table B, the value in table b gets added to the matchin row(s). Nevertheless, both tables are not of the same size and some times there are id's in table b that do not exist in table a. I only need the ones in table a.

Upvotes: 1

Views: 643

Answers (1)

jezrael
jezrael

Reputation: 862751

Fastest is use map:

df1['value2'] = df1['id'].map(df2.set_index('id')['value2'])
print (df1)
   id  id2 value value2
0   1    1     a      e
1   1    2     b      e
2   2    1     c      g
3   3    1     d      h

EDIT:

print (df2)
   id value2
0   1      e
1   1      p
2   2      g
3   3      h

df1['value2'] = df1['id'].map(df2.set_index('id')['value2'])
print (df1)

InvalidIndexError: Reindexing only valid with uniquely valued Index objects

Ans solution is remove them:

print (df2)
   id value2
0   1      e
1   1      p
2   2      g
3   3      h

df2 = df2.drop_duplicates(subset='id')
print (df2)
   id value2
0   1      e
2   2      g
3   3      h

df1['value2'] = df1['id'].map(df2.set_index('id')['value2'])
print (df1)
   id  id2 value value2
0   1    1     a      e
1   1    2     b      e
2   2    1     c      g
3   3    1     d      h

Solution with repeating values with merge and left join:

df = pd.merge(df1, df2, on='id', how='left')
print (df)
   id  id2 value value2
0   1    1     a      e
1   1    1     a      p
2   1    2     b      e
3   1    2     b      p
4   2    1     c      g
5   3    1     d      h

Upvotes: 1

Related Questions