Reputation: 1347
Let's say I have three DataFrames:
import pandas as pd
import numpy as np
cols = ['A','B','C']
index = [1,2,3,4,5]
np.random.seed(42)
apple = pd.DataFrame(np.random.randn(5,3), index=index, columns=cols)
orange = pd.DataFrame(np.random.randn(5,3), index=index, columns=cols)
banana = pd.DataFrame(np.random.randn(5,3), index=index, columns=cols)
In [50]: apple
Out[50]:
          A         B         C
1  0.496714 -0.138264  0.647689
2  1.523030 -0.234153 -0.234137
3  1.579213  0.767435 -0.469474
4  0.542560 -0.463418 -0.465730
5  0.241962 -1.913280 -1.724918
In [51]: orange
Out[51]:
          A         B         C
1 -0.562288 -1.012831  0.314247
2 -0.908024 -1.412304  1.465649
3 -0.225776  0.067528 -1.424748
4 -0.544383  0.110923 -1.150994
5  0.375698 -0.600639 -0.291694
In [52]: banana
Out[52]:
          A         B         C
1 -0.601707  1.852278 -0.013497
2 -1.057711  0.822545 -1.220844
3  0.208864 -1.959670 -1.328186
4  0.196861  0.738467  0.171368
5 -0.115648 -0.301104 -1.478522
What's the best/fastest/easiest way to create a new dataframe with the same columns and index, but with the maximum value from each column and index for apple, orange, banana? I.e., for [1,A] the new dataframe value would be 0.496714, for [1,B] the value would be 1.852278, etc. Thanks!
Upvotes: 2
Views: 68
Reputation: 1
Why not concatenate the DataFrames into a Panel and then use Panel.max()? I.e.:
pd.Panel({'a': apple, 'b': banana, 'o': orange}).max(axis=0)
Admittedly this is not the fastest option, but it guarantees correct index alignment, and you might want to use the Panel for something else later. Your data looks to be 3D, with three indexing elements (cols/index/fruit), so use a 3D data structure.
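On recent pandas releases, where Panel is no longer available (it was deprecated and later removed), the same label-aligned idea can be sketched with pd.concat and a groupby over the original index level. This is only one possible substitute, not the Panel API itself; the keys 'a', 'b', 'o' are just the labels used in this answer.

```python
import numpy as np
import pandas as pd

cols = ['A', 'B', 'C']
index = [1, 2, 3, 4, 5]
np.random.seed(42)
apple = pd.DataFrame(np.random.randn(5, 3), index=index, columns=cols)
orange = pd.DataFrame(np.random.randn(5, 3), index=index, columns=cols)
banana = pd.DataFrame(np.random.randn(5, 3), index=index, columns=cols)

# Stack the frames under an outer "fruit" level; the inner level
# (level 1) keeps the original row labels.
stacked = pd.concat({'a': apple, 'b': banana, 'o': orange})

# Group rows that share the same original label and take the
# column-wise maximum within each group.
result = stacked.groupby(level=1).max()
print(result)
```

Because the grouping is done on row labels rather than positions, frames whose rows are in a different order should still line up correctly.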
Upvotes: 0
Reputation: 117370
I think something like this should be fast:
np.maximum(np.maximum(orange, apple), banana)
Using numpy.maximum():
Element-wise maximum of array elements.
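As @Jeff suggested in the comments, the general form for any number of DataFrames is:
from functools import reduce  # needed on Python 3; reduce is a builtin on Python 2
reduce(np.maximum, [orange, apple, banana])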
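For completeness, here is a self-contained sketch of the reduce approach using the question's setup. The name `frames` is only for illustration, and the check against a plain NumPy stack assumes all frames share the same index and columns, as they do in the question.

```python
from functools import reduce

import numpy as np
import pandas as pd

cols = ['A', 'B', 'C']
index = [1, 2, 3, 4, 5]
np.random.seed(42)
apple = pd.DataFrame(np.random.randn(5, 3), index=index, columns=cols)
orange = pd.DataFrame(np.random.randn(5, 3), index=index, columns=cols)
banana = pd.DataFrame(np.random.randn(5, 3), index=index, columns=cols)

frames = [orange, apple, banana]

# Fold the list pairwise: maximum(maximum(orange, apple), banana).
# np.maximum applied to DataFrames returns a DataFrame.
elementwise_max = reduce(np.maximum, frames)

# Cross-check against a pure NumPy computation on the raw values.
check = np.stack([f.values for f in frames]).max(axis=0)
print(elementwise_max)
print(np.allclose(elementwise_max.values, check))
```

The list can hold any number of frames, which is the main advantage over nesting np.maximum calls by hand.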
Upvotes: 3