Kamal Das
Kamal Das

Reputation: 41

Python: Take maximum values of two dataframes to create third dataframe

Problem

df1 = pd.DataFrame({"A":[12, 4, 5, 44, 1], 
                   "B":[5, 2, 54, 3, 2] }) 

df2 = pd.DataFrame({"A":[20, 16, 7, 3, 8], 
                   "B":[14, 3, 17, 2, 6]})

Given two dataframes, df1 and df2, I want to create a third dataframe, df3 which contains the maximum values of the two dataframes.

df3 =pd.DataFrame({"A":[20, 16, 7, 44, 8], 
                   "B":[14, 3, 54, 3, 6]}) 

Attempt

I created two temp dataframes with both columns of A from df1 and df2 into a numpy array, then found each maximum value. The same process was repeated for B. I then combined both arrays for A and B to get df3. However, I feel this is not elegant and I want a more efficient method for accomplishing this task.

Thank you @beny @Romero_91 @cameron-riddell ... your solutions are very elegant and simpler than mine! I knew I was missing something!!

Upvotes: 4

Views: 571

Answers (3)

Cameron Riddell
Cameron Riddell

Reputation: 13407

You can use the dataframe method combine to perform an elementwise maximum:

import numpy as np

df3 = df1.combine(df2, np.maximum)
print(df3)
    A   B
0  20  14
1  16   3
2   7  54
3  44   3
4   8   6

As pointed out by @anky, np.maximum on its own performs this elemnt-wise comparison. It's always good to remember those pure numpy solutions especially when they lead to such clean & simple code.

df3 = np.maximum(df1, df2)
print(df3)
    A   B
0  20  14
1  16   3
2   7  54
3  44   3

Upvotes: 3

Romero_91
Romero_91

Reputation: 425

You can use the where method of numpy as:

import numpy as np

df3 = pd.DataFrame(np.where(df1>df2, df1, df2), columns=['A', 'B'])

Upvotes: 1

BENY
BENY

Reputation: 323236

Let us do

out = df2.mask(df1>df2,df1)
Out[141]: 
    A   B
0  20  14
1  16   3
2   7  54
3  44   3
4   8   6

Upvotes: 2

Related Questions