N M
N M

Reputation: 616

Merge pandas dataframe with overwrite of columns

What is the quickest way to merge to python data frames in this manner? I have two data frames with similar structures (both have a primary key id and some value columns). What I want to do is merge the two data frames based on id. Are there any ways do this based on pandas operations? How I've implemented it right now is as coded below:

import pandas as pd

a = pd.DataFrame({'id': [1,2,3], 'letter': ['a', 'b', 'c']})
b = pd.DataFrame({'id': [1,3,4], 'letter': ['A', 'C', 'D']})
a_dict =  {e[id]: e for e in a.to_dict('record')}
b_dict =  {e[id]: e for e in b.to_dict('record')}
c_dict = a_dict.copy()
c_dict.update(b_dict)
c = pd.DataFrame(list(c.values())

Here, c would be equivalent to

pd.DataFrame({'id': [1,2,3,4], 'letter':['A','b', 'C', 'D']})

   id letter
0   1      A
1   2      b
2   3      C
3   4      D

Upvotes: 6

Views: 4864

Answers (3)

piRSquared
piRSquared

Reputation: 294218

combine_first

If 'id' is your primary key, then use it as your index.

b.set_index('id').combine_first(a.set_index('id')).reset_index()

   id letter
0   1      A
1   2      b
2   3      C
3   4      D

merge with groupby

a.merge(b, 'outer', 'id').groupby(lambda x: x.split('_')[0], axis=1).last()

   id letter
0   1      A
1   2      b
2   3      C
3   4      D

Upvotes: 7

niraj
niraj

Reputation: 18208

One way may be as following:

  • append dataframe a to dataframe b
  • drop duplicates based on id
  • sort values on remaining by id
  • reset index and drop older index

You can try:

import pandas as pd

a = pd.DataFrame({'id': [1,2,3], 'letter': ['a', 'b', 'c']})
b = pd.DataFrame({'id': [1,3,4], 'letter': ['A', 'C', 'D']})

c = b.append(a).drop_duplicates(subset='id').sort_values('id').reset_index(drop=True)

print(c)

Upvotes: 1

Khalil Al Hooti
Khalil Al Hooti

Reputation: 4506

Try this

c = pd.concat([a, b], axis=0).sort_values('letter').drop_duplicates('id', keep='first').sort_values('id')

c.reset_index(drop=True, inplace=True)

print(c)

    id  letter
0   1   A
1   2   b
2   3   C
3   4   D

Upvotes: 0

Related Questions