Reputation: 41
df1 = pd.DataFrame({"A":[12, 4, 5, 44, 1],
"B":[5, 2, 54, 3, 2] })
df2 = pd.DataFrame({"A":[20, 16, 7, 3, 8],
"B":[14, 3, 17, 2, 6]})
Given two dataframes, df1
and df2
, I want to create a third dataframe, df3
which contains the maximum values of the two dataframes.
df3 =pd.DataFrame({"A":[20, 16, 7, 44, 8],
"B":[14, 3, 54, 3, 6]})
I created two temp dataframes with both columns of A from df1
and df2
into a numpy
array, then found each maximum value. The same process was repeated for B. I then combined both arrays for A and B to get df3
. However, I feel this is not elegant and I want a more efficient method for accomplishing this task.
Thank you @beny @Romero_91 @cameron-riddell ... your solutions are very elegant and simpler than mine! I knew I was missing something!!
Upvotes: 4
Views: 571
Reputation: 13407
You can use the dataframe method combine
to perform an elementwise maximum:
import numpy as np
df3 = df1.combine(df2, np.maximum)
print(df3)
A B
0 20 14
1 16 3
2 7 54
3 44 3
4 8 6
As pointed out by @anky, np.maximum
on its own performs this elemnt-wise comparison. It's always good to remember those pure numpy solutions especially when they lead to such clean & simple code.
df3 = np.maximum(df1, df2)
print(df3)
A B
0 20 14
1 16 3
2 7 54
3 44 3
Upvotes: 3
Reputation: 425
You can use the where
method of numpy
as:
import numpy as np
df3 = pd.DataFrame(np.where(df1>df2, df1, df2), columns=['A', 'B'])
Upvotes: 1
Reputation: 323236
Let us do
out = df2.mask(df1>df2,df1)
Out[141]:
A B
0 20 14
1 16 3
2 7 54
3 44 3
4 8 6
Upvotes: 2