Combine_first and null values in Pandas

Question

df1:

     0    1
0   nan 3.00
1 -4.00  nan
2   nan 7.00

df2:

      0   1    2
1 -42.00 nan 8.00
2  -5.00 nan 4.00

df3 = df1.combine_first(df2)

df3:

      0    1    2
0   nan 3.00  nan
1 -4.00  nan 8.00
2 -5.00 7.00 4.00

This is what I'd like df3 to be:

      0    1    2
0   nan 3.00  nan
1 -4.00  nan 8.00
2   nan 7.00 4.00

(The difference is in the cell at row 2, column 0, i.e. df3.loc[2, 0].)

That is, for any cell whose index and column labels exist in both df1 and df2, I'd like df1's value to prevail, even if that value is nan. combine_first already does this, except when df1's value is nan, in which case it falls back to df2's value.
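For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the two frames from the tables above and shows the behaviour being asked about. The constructor calls are my own reconstruction of the printed data, not code from the original post.

```python
import numpy as np
import pandas as pd

# Rebuild df1 and df2 from the tables shown in the question.
df1 = pd.DataFrame({0: [np.nan, -4.0, np.nan],
                    1: [3.0, np.nan, 7.0]})
df2 = pd.DataFrame({0: [-42.0, -5.0],
                    1: [np.nan, np.nan],
                    2: [8.0, 4.0]},
                   index=[1, 2])

# combine_first keeps df1's non-null values and fills df1's nulls from df2.
df3 = df1.combine_first(df2)

# The cell in question: df1 has nan at (2, 0), so df2's value is used.
print(df3.loc[2, 0])
```

The printed cell is df2's value rather than nan, which is exactly the part of combine_first's behaviour the question wants to avoid.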

chrisb · Accepted Answer

Here's a bit of a hacky way to do it. First, align df2 with df1, which returns a copy of df2 reindexed to the union of both frames' index and columns (cells that df2 doesn't have are NaN). Then assign df1's values back over it, NaNs included.

In [325]: df3, _ = df2.align(df1)

In [327]: df3.loc[df1.index, df1.columns] = df1

In [328]: df3
Out[328]: 
    0   1   2
0 NaN   3 NaN
1  -4 NaN   8
2 NaN   7   4
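The same two steps can be wrapped in a small helper, sketched below as a self-contained script. The function name `combine_prefer_left` is hypothetical (it is not a pandas API), the frames are reconstructed from the tables in the question, and the sketch assumes both frames have unique index and column labels.

```python
import numpy as np
import pandas as pd


def combine_prefer_left(left, right):
    """Union of both frames' labels; `left` wins wherever it has a cell,
    even if that cell is NaN. (Hypothetical helper, not part of pandas.)"""
    # Outer-align `right` to the union of index and columns. Cells that
    # `right` does not have come back as NaN.
    out, _ = right.align(left, join="outer")
    # Overwrite every cell that exists in `left`, NaNs included.
    out.loc[left.index, left.columns] = left
    return out


df1 = pd.DataFrame({0: [np.nan, -4.0, np.nan],
                    1: [3.0, np.nan, 7.0]})
df2 = pd.DataFrame({0: [-42.0, -5.0],
                    1: [np.nan, np.nan],
                    2: [8.0, 4.0]},
                   index=[1, 2])

result = combine_prefer_left(df1, df2)
print(result)
```

With these inputs, the cell at (2, 0) stays NaN because df1 owns that cell, while column 2, which exists only in df2, is carried over unchanged.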
