kawuel
kawuel

Reputation: 82

Pandas Update on DataFrame to carryover NaN from second DataFrame

When updating one DataFrame based on info from second one the NaN's are not transferred. With below code

import pandas as pd
import numpy as np
df1 = pd.DataFrame({'A':[1,2,3,4,5], 'B':[4,5,6,7,8]})
df2 = pd.DataFrame({'A':[1,2,3], 'B':[7,np.nan,6]})
df1.update(df2)

The new DataFrame will look as below:

enter image description here

So the value from df2 is not transferred as this is a NaN

enter image description here

Normally this is probably an expected outcome but not in the particular case that I'm working on. For the moment how I deal with it is that I replace all NaNs with a value or string that signalizes to me that it's something I need to check but I was wondering if there is an actual solution to pass the NaN as an update?

enter image description here

Upvotes: 0

Views: 61

Answers (2)

Corralien
Corralien

Reputation: 120419

Concat dataframes then remove duplicated index:

>>> pd.concat([df2, df1]).loc[lambda x: ~x.index.duplicated()].sort_index()
   A    B
0  1  7.0
1  2  NaN
2  3  6.0
3  4  7.0
4  5  8.0

Upvotes: 2

PrinsEdje80
PrinsEdje80

Reputation: 514

What you could do is a bit more involved, but solves the problem as well without using fillna or so:

df3 = df1.merge(df2, how='left', suffixes=('___',''), on=['A'], indicator=True)
df3['finalB'] = df3.apply(lambda x: x['B'] if x['_merge'] == 'both' else x['B___'], axis=1)
df3[['A','finalB']]

Output:

     A  finalB
0  1.0     7.0
1  2.0     NaN
2  3.0     6.0
3  4.0     7.0
4  5.0     8.0

Upvotes: 2

Related Questions