Daniel Bourke

Reputation: 406

Update a pandas dataframe with data from another dataframe

I've got two similar DataFrames.

df1.head()
        1        2        3      4
3234    Lorum    Ipsum    Foo    Bar
8839    NaN      NaN      NaN    NaN
9911    Lorum    Ipsum    Bar    Foo
2256    NaN      NaN      NaN    NaN

df2.head()
        1        3        4
8839    Lorum    Ipsum    Foo
2256    Lorum    Ipsum    Bar

I'd like to update df1 with the values from df2 wherever the index and column labels match, so that the NaN values get filled in.

Ideal outcome:

df3.head()
        1        2        3      4
3234    Lorum    Ipsum    Foo    Bar
8839    Lorum    NaN      Ipsum  Foo
9911    Lorum    Ipsum    Bar    Foo
2256    Lorum    NaN      Ipsum  Bar

df2 doesn't contain all of df1's columns, but the columns it does contain match, and its index values all appear in df1.

I've tried this:

df3 = df1.update(df2)

But I haven't had any success. I've been looking at the docs and think pd.merge or pd.concat may help, but I'm a bit confused.

Thank you

Upvotes: 6

Views: 8604

Answers (1)

jezrael

Reputation: 862406

You can use combine_first with reindex:

df3 = df2.combine_first(df1).reindex(df1.index)
print (df3)
          1      2      3    4
3234  Lorum  Ipsum    Foo  Bar
8839  Lorum    NaN  Ipsum  Foo
9911  Lorum  Ipsum    Bar  Foo
2256  Lorum    NaN  Ipsum  Bar
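
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the two frames from the question and applies combine_first. The integer column labels and the extra columns=df1.columns argument to reindex are my assumptions, added only to pin the column order as well as the row order; the one-liner above is the actual suggestion.

```python
import numpy as np
import pandas as pd

# Sample frames reconstructed from the question (column labels assumed to be ints).
df1 = pd.DataFrame(
    [
        ["Lorum", "Ipsum", "Foo", "Bar"],
        [np.nan, np.nan, np.nan, np.nan],
        ["Lorum", "Ipsum", "Bar", "Foo"],
        [np.nan, np.nan, np.nan, np.nan],
    ],
    index=[3234, 8839, 9911, 2256],
    columns=[1, 2, 3, 4],
)
df2 = pd.DataFrame(
    [["Lorum", "Ipsum", "Foo"], ["Lorum", "Ipsum", "Bar"]],
    index=[8839, 2256],
    columns=[1, 3, 4],
)

# combine_first takes values from df2 where it has them and falls back to df1 otherwise.
combined = df2.combine_first(df1)

# The result is built on the union of both indexes, so restore df1's row and column order.
df3 = combined.reindex(index=df1.index, columns=df1.columns)
print(df3)
```

Note that df2's values win wherever df2 has a non-missing value, even if df1 already holds something in that cell, which matches what update does.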

Or use your solution, but note that update works in place and returns None, so assigning its result to a variable gives you None:

df1.update(df2)
print (df1)
          1      2      3    4
3234  Lorum  Ipsum    Foo  Bar
8839  Lorum    NaN  Ipsum  Foo
9911  Lorum  Ipsum    Bar  Foo
2256  Lorum    NaN  Ipsum  Bar

print (df1.update(df2))
None
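
If you want the result in a separate df3 and need df1 left untouched, one option is to call update on a copy. This is only a sketch: the frames are rebuilt from the question with assumed integer column labels, and the name result is just there to show the None return value.

```python
import numpy as np
import pandas as pd

# Sample frames reconstructed from the question (column labels assumed to be ints).
df1 = pd.DataFrame(
    [
        ["Lorum", "Ipsum", "Foo", "Bar"],
        [np.nan, np.nan, np.nan, np.nan],
        ["Lorum", "Ipsum", "Bar", "Foo"],
        [np.nan, np.nan, np.nan, np.nan],
    ],
    index=[3234, 8839, 9911, 2256],
    columns=[1, 2, 3, 4],
)
df2 = pd.DataFrame(
    [["Lorum", "Ipsum", "Foo"], ["Lorum", "Ipsum", "Bar"]],
    index=[8839, 2256],
    columns=[1, 3, 4],
)

# update() modifies the frame it is called on, so copy df1 first to keep it intact.
df3 = df1.copy()

# The return value is None; the updated data lives in df3 itself.
result = df3.update(df2)

print(result)
print(df3)
```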

Upvotes: 3
