Ziqun Liu

Reputation: 43

How does df.combine() work?

df1 = pd.DataFrame({'A': [0, 0], 'B': [None, 4]})
df2 = pd.DataFrame({'A': [1, 1], 'B': [3, 0]})
df1.combine(df2, take_smaller, fill_value=-5)

The code above yields a result that contains 4.0. Where does the 4.0 come from?

Upvotes: 4

Views: 159

Answers (2)

Dishin H Goyani

Reputation: 7693

I guess you are referring to the example in the docs:

take_smaller = lambda s1, s2: s1 if s1.sum() < s2.sum() else s2

If so, you are passing fill_value=-5, so the column B series passed to the function are [-5, 4] and [3, 0]. Since -5 + 4 = -1 is less than 3 + 0 = 3, the series [-5, 4] is returned.
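The arithmetic can be checked by hand. This is only a rough sketch that imitates the fill step for column B with fillna; it is not how pandas implements combine internally, and the names b1, b2, sum_b1, sum_b2 and chosen_b are made up for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'A': [0, 0], 'B': [None, 4]})
df2 = pd.DataFrame({'A': [1, 1], 'B': [3, 0]})

# Imitate what fill_value=-5 does to column B before the function sees it.
b1 = df1['B'].fillna(-5)  # NaN is replaced by -5
b2 = df2['B'].fillna(-5)  # nothing to fill here

sum_b1 = b1.sum()  # -5 + 4
sum_b2 = b2.sum()  # 3 + 0

# Same rule as take_smaller: keep the series with the smaller sum.
chosen_b = b1 if sum_b1 < sum_b2 else b2
print(sum_b1, sum_b2, chosen_b.tolist())
```

The chosen series comes from df1, so the 4 in its second row is carried into the result unchanged.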

Upvotes: 4

anky

Reputation: 75080

From the example in the docs:

take_smaller = lambda s1, s2: s1 if s1.sum() < s2.sum() else s2

This says: for each column, if the sum of the series from df1 is less than the sum of the series from df2, return the series from df1, otherwise return the one from df2.

So when you do:

df1.combine(df2, take_smaller)

   A    B
0  0  3.0
1  0  0.0

This works fine.

However, when you pass fill_value=-5, the sum of column B in df1 becomes the smaller one, because fill_value fills the NaN first and the comparison happens afterwards. (-5 + 4) < (3 + 0), hence -5 and 4 are returned.

fill_value : scalar value, default None
    The value to fill NaNs with prior to passing any column to the merge func.
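To see the effect of fill_value side by side, here is a small sketch that runs both calls on the frames from the question. The variable names without_fill and with_fill are just for illustration.

```python
import pandas as pd

# take_smaller as defined in the docs example
take_smaller = lambda s1, s2: s1 if s1.sum() < s2.sum() else s2

df1 = pd.DataFrame({'A': [0, 0], 'B': [None, 4]})
df2 = pd.DataFrame({'A': [1, 1], 'B': [3, 0]})

# No fill: sum() skips the NaN, so df1['B'] sums to 4, which is not < 3.
without_fill = df1.combine(df2, take_smaller)

# With fill: the NaN becomes -5 first, so df1['B'] sums to -1, which is < 3.
with_fill = df1.combine(df2, take_smaller, fill_value=-5)

print(without_fill['B'].tolist())
print(with_fill['B'].tolist())
```

Column A is taken from df1 in both cases, since 0 + 0 is less than 1 + 1 either way.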

Upvotes: 6
